<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2012.00202</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Integrative Genomics: Quantifying Significance of Phenotype-Genotype Relationships from Multiple Sources of High-Throughput Data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Gamazon</surname> <given-names>Eric R.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Huang</surname> <given-names>R. Stephanie</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Dolan</surname> <given-names>M. Eileen</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Cox</surname> <given-names>Nancy J.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Im</surname> <given-names>Hae Kyung</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="author-notes" rid="fn001">&#x0002A;</xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Medicine, University of Chicago</institution> <country>Chicago, IL, USA</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Human Genetics, University of Chicago</institution> <country>Chicago, IL, USA</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Health Studies, University of Chicago</institution> <country>Chicago, IL, USA</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Barbara E. Stranger, Brigham and Women&#x02019;s Hospital, USA</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Ali Torkamani, University of California at San Diego, USA; Marylyn D. Ritchie, The Pennsylvania State University, USA; Qingying Meng, University of California Los Angeles, USA</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Hae Kyung Im, Department of Health Studies, University of Chicago, 5841&#x02009;S Maryland Avenue MC 2007, Chicago, IL 60637, USA. e-mail: <email>haky&#x00040;uchicago.edu</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Frontiers in Statistical Genetics and Methodology, a specialty of Frontiers in Genetics.</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>31</day>
<month>05</month>
<year>2013</year>
</pub-date>
<pub-date pub-type="collection">
<year>2012</year>
</pub-date>
<volume>3</volume>
<elocation-id>202</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>05</month>
<year>2012</year>
</date>
<date date-type="accepted">
<day>20</day>
<month>09</month>
<year>2012</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2013 Gamazon, Huang, Dolan, Cox and Im.</copyright-statement>
<copyright-year>2013</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.</p></license>
</permissions>
<abstract>
<p>Given recent advances in the generation of high-throughput data such as whole-genome genetic variation and transcriptome expression, it is critical to develop novel methods to integrate these heterogeneous datasets and to assess the significance of identified phenotype-genotype relationships. Recent studies show that genome-wide association findings are likely to fall in loci with gene regulatory effects such as expression quantitative trait loci (eQTLs), demonstrating the utility of such integrative approaches. When genotype and gene expression data are available on the same individuals, we and others have developed methods wherein top phenotype-associated genetic variants are prioritized if they are associated, as eQTLs, with gene expression traits that are themselves associated with the phenotype. Yet there has been no method to determine an overall <italic>p</italic>-value for the findings that arise specifically from the integrative nature of the approach. We propose a computationally feasible permutation method that accounts for the assimilative nature of the method and the correlation structure among gene expression traits and among genotypes. We apply the method to data from a study of cellular sensitivity to etoposide, one of the most widely used chemotherapeutic drugs. To our knowledge, this study is the first statistically sound quantification of the overall significance of the genotype-phenotype relationships resulting from applying an integrative approach. This method can be easily extended to cases in which gene expression data are replaced by other molecular phenotypes of interest, e.g., microRNA or proteomic data. This study has important implications for studies seeking to expand on genetic association studies by the use of omics data. Finally, we provide R code to compute the empirical false discovery rate when <italic>p</italic>-values for the observed and simulated phenotypes are available.</p>
</abstract>
<kwd-group>
<kwd>eQTLs</kwd>
<kwd>FDR</kwd>
<kwd>gene expression</kwd>
<kwd>genomics</kwd>
<kwd>GWAS</kwd>
<kwd>integrative genomics</kwd>
<kwd>permutation</kwd>
<kwd>phenotype</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="0"/>
<equation-count count="6"/>
<ref-count count="39"/>
<page-count count="7"/>
<word-count count="5694"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction">
<title>Introduction</title>
<p>The availability of genome-wide datasets is facilitating unprecedented insights into various aspects of cellular processes. Technological advances (Metzker, <xref ref-type="bibr" rid="B22">2010</xref>) in high-throughput methods are contributing to new approaches in genomics, transcriptomics (Wang et al., <xref ref-type="bibr" rid="B36">2009</xref>), proteomics (Farnham, <xref ref-type="bibr" rid="B6">2009</xref>), and epigenomics (Laird, <xref ref-type="bibr" rid="B20">2010</xref>; Zhou et al., <xref ref-type="bibr" rid="B39">2011</xref>), allowing in-depth interrogation of diverse biological processes. A primary challenge from the tremendously heterogeneous and increasingly massive datasets is data integration &#x02013; a challenge that is inevitably bound to intensify with the deluge of these high-throughput datasets. Nevertheless, among the many exciting promises, integrative approaches are likely to yield a comprehensive map of genome function (Degner et al., <xref ref-type="bibr" rid="B3">2012</xref>) as well as a high-resolution view into the complex logic of biological systems (Hawkins et al., <xref ref-type="bibr" rid="B13">2010</xref>).</p>
<p>Indeed, while genome-wide association studies (GWAS) have identified thousands of common genetic variants associated with diseases and other complex human traits (Hindorff et al., <xref ref-type="bibr" rid="B14">2009</xref>), functional understanding of many of the variants remains elusive. Integrating other omics datasets into genome-wide analyses offers the potential to provide systematic insight into the mechanisms underlying the observed genotype-phenotype relationships. One common approach to the integration of functional data into GWAS is the use of expression quantitative trait loci (eQTL; Stranger et al., <xref ref-type="bibr" rid="B33">2007a</xref>; Duan et al., <xref ref-type="bibr" rid="B5">2008</xref>; Schadt et al., <xref ref-type="bibr" rid="B29">2008</xref>) information to expand on the nature of the genetic component to complex phenotypes (Gamazon et al., <xref ref-type="bibr" rid="B7">2010a</xref>; Nicolae et al., <xref ref-type="bibr" rid="B26">2010</xref>). Such an integrative approach is clearly extensible to the use of protein (Garge et al., <xref ref-type="bibr" rid="B10">2010</xref>) or microRNA quantitative trait loci (Gamazon et al., <xref ref-type="bibr" rid="B9">2012</xref>), and indeed to other functionally relevant features of the genome, to improve identification of functional variants.</p>
<p>Our group (Huang et al., <xref ref-type="bibr" rid="B15">2007a</xref>; Welsh et al., <xref ref-type="bibr" rid="B37">2009</xref>; Nicolae et al., <xref ref-type="bibr" rid="B26">2010</xref>) and others (Cheung et al., <xref ref-type="bibr" rid="B1">2003</xref>; Correa and Cheung, <xref ref-type="bibr" rid="B2">2004</xref>; Stranger et al., <xref ref-type="bibr" rid="B34">2007b</xref>; Nica et al., <xref ref-type="bibr" rid="B25">2010</xref>) have used the HapMap lymphoblastoid cell lines (LCLs) as a model for human genotype-phenotype relationships. The cell lines have been the subject of several whole-genome gene expression profiling studies (Montgomery et al., <xref ref-type="bibr" rid="B24">2010</xref>; Pickrell et al., <xref ref-type="bibr" rid="B28">2010</xref>; Stranger et al., <xref ref-type="bibr" rid="B35">2012</xref>) to identify functional loci (e.g., eQTLs) with potentially important links to SNP associations emerging from genome-wide studies. Furthermore, the cell lines have been utilized to identify the molecular consequences associated with various exposures (Dermitzakis, <xref ref-type="bibr" rid="B4">2012</xref>), such as drugs (Huang et al., <xref ref-type="bibr" rid="B16">2007b</xref>), small molecules, or pathogens (Ko et al., <xref ref-type="bibr" rid="B18">2009</xref>). For example, a three-way &#x0201C;triangle&#x0201D; model, correlating genotype, gene expression, and phenotype data, has been devised to identify genetic variants that contribute to chemotherapeutic-induced cytotoxicity through their effects on gene expression (Huang et al., <xref ref-type="bibr" rid="B16">2007b</xref>). Nevertheless, quantifying the significance of a finding from such an integrative approach remains to be fully addressed.</p>
</sec>
<sec sec-type="materials|methods" id="s1">
<title>Materials and Methods</title>
<sec>
<title>Functional integration</title>
<p>A simple approach to integrate high-throughput functional datasets (e.g., from studies of the transcriptome, proteome, or microRNAome) with genome-wide genotype data obtained from microarray- or sequencing-based studies is to select SNPs that meet certain functional criteria as illustrated in the example in Figure <xref ref-type="fig" rid="F1">1</xref>. In the first step of this example, SNPs are filtered by requiring that they be associated with genes whose expression levels are associated with the phenotype (Zhong et al., <xref ref-type="bibr" rid="B38">2010</xref>). In the next step, we further reduce the number of SNPs by requiring that they be associated with protein levels that are themselves associated with the phenotype. This process can continue using other omics datasets.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>The use of omics data to perform SNP filtering</bold>. A series of functional filters is applied to the original SNP set to reduce the dimensionality of the data. The reduced multiple testing burden may improve power to detect genotype associations with phenotype.</p></caption>
<graphic xlink:href="fgene-03-00202-g001.tif"/>
</fig>
<p>To simplify the description, we focus on the case in which only the gene (mRNA) expression data are integrated, as depicted in the diagram in Figure <xref ref-type="fig" rid="F2">2</xref>. This triangle approach and variations thereof were proposed by Huang et al. (<xref ref-type="bibr" rid="B16">2007b</xref>) and others (Zhong et al., <xref ref-type="bibr" rid="B38">2010</xref>) and applied to an array of cellular phenotypes. The first step of this method aims to identify a set of gene expression traits associated with the given phenotype at a chosen <italic>p</italic>-value threshold, <italic>p</italic>&#x02009;&#x0003C;&#x02009;<italic>p</italic><sub>gene-phenotype</sub>. It is important to emphasize that this threshold, like the thresholds defined below, is generally set arbitrarily. In practice, these thresholds are used to prioritize genes or SNPs for downstream analyses. Indeed, one aim of our study is to quantify the significance of an association from a triangle method regardless of the choice of thresholds used during the integrative process. The second step of the method is to identify SNPs that are associated with the selected gene expression traits, again at an arbitrarily set threshold, <italic>p</italic>&#x02009;&#x0003C;&#x02009;<italic>p</italic><sub>SNP-gene</sub>. At a stringent threshold, this step maps the gene expression traits to genomic loci; this step thus identifies the eQTLs for the corresponding genes. Finally, in the last step of the triangle, the resulting SNPs are interrogated for association with the phenotype. Our primary aim is to describe a method to quantify the significance of the SNPs resulting from this multi-step &#x0201C;triangle&#x0201D; approach.</p>
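<p>As an illustration only, the three steps can be sketched in Python for the case in which all association <italic>p</italic>-values have already been computed. The function name <monospace>triangle_filter</monospace>, the dictionary layout, the toy SNP and gene identifiers, and the default threshold values are assumptions introduced for this sketch and are not part of the published method.</p>
<preformat><![CDATA[
```python
def triangle_filter(gene_pheno_p, snp_gene_p, snp_pheno_p,
                    p_gene_phenotype=0.05, p_snp_gene=5e-6):
    """Return {snp: SNP-phenotype p-value} for SNPs that pass both filters.

    gene_pheno_p: {gene: p-value of gene expression vs. phenotype}
    snp_gene_p:   {(snp, gene): p-value of SNP vs. gene expression}
    snp_pheno_p:  {snp: p-value of SNP vs. phenotype}
    The two thresholds are arbitrary, illustrative defaults.
    """
    # Step 1: gene expression traits associated with the phenotype.
    genes = {g for g, p in gene_pheno_p.items() if p < p_gene_phenotype}
    # Step 2: eQTLs, i.e. SNPs associated with at least one selected gene.
    snps = {s for (s, g), p in snp_gene_p.items()
            if g in genes and p < p_snp_gene}
    # Step 3: collect the SNP-phenotype p-values of the surviving SNPs.
    return {s: snp_pheno_p[s] for s in sorted(snps)}


# Toy inputs (hypothetical identifiers and values).
gene_pheno_p = {"geneA": 0.01, "geneB": 0.20}
snp_gene_p = {("rs1", "geneA"): 1e-8,
              ("rs2", "geneA"): 1e-3,
              ("rs3", "geneB"): 1e-9}
snp_pheno_p = {"rs1": 0.002, "rs2": 0.5, "rs3": 0.04}

selected = triangle_filter(gene_pheno_p, snp_gene_p, snp_pheno_p)
print(selected)  # {'rs1': 0.002}
```
]]></preformat>
<p>In this toy input only one SNP survives: the second SNP fails the SNP-gene threshold, and the third is an eQTL for a gene that is not associated with the phenotype.</p>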
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>The &#x0201C;triangle&#x0201D; method for integration of genotype, gene expression, and phenotype data</bold>. Through a series of steps, heterogeneous datasets, involving SNPs, gene expression and trait are integrated. At each step, a <italic>p</italic>-value threshold is applied. In general, the <italic>p</italic>-value threshold used is arbitrary; in practice, the choice allows for prioritization of genes or SNPs. The result of the triangle method is a set of SNP association <italic>p</italic>-values (represented by the &#x0201C;obs <italic>p</italic>&#x0201D; in the figure).</p></caption>
<graphic xlink:href="fgene-03-00202-g002.tif"/>
</fig>
</sec>
<sec>
<title>Na&#x000EF;ve FDR of selected SNPs</title>
<p>Since the triangle method is a multi-step approach that derives a final SNP set from a series of (potentially) increasingly stringent thresholds, it is reasonable to expect that such an approach should yield a final set with substantially reduced false discovery rates (FDRs) for association with the phenotype. A simple approach to assess the significance of the findings for this subset of SNPs would be to compute the FDR for them (Storey and Tibshirani, <xref ref-type="bibr" rid="B32">2003</xref>). We illustrate the problem with this approach in Figure <xref ref-type="fig" rid="F3">3</xref>, in which we show the QQ plot of the associations after applying the triangle method to a simulated phenotype that has no association with genotype. In this particular example, the first threshold <italic>p</italic><sub>gene-phenotype</sub> was set at 0.05 while <italic>p</italic><sub>SNP-gene</sub> was set at 5&#x02009;&#x000D7;&#x02009;10<sup>&#x02212;6</sup>. Circles above the red line represent SNPs with FDR&#x02009;&#x0003C;&#x02009;0.05. (Strictly speaking, every circle whose <italic>p</italic>-value is smaller than the largest <italic>p</italic>-value lying above the red line has FDR&#x02009;&#x0003C;&#x02009;0.05.) As the figure indicates, the triangle method may yield several spurious associations if we rely on a &#x0201C;na&#x000EF;ve&#x0201D; FDR approach. This example shows the need to develop a more sophisticated approach to estimate the significance of results in this integrative context.</p>
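<p>For concreteness, a minimal Python sketch of such a na&#x000EF;ve estimate is given here. It assumes that null <italic>p</italic>-values are uniform and, as a simplifying assumption of this sketch, sets the proportion of null SNPs to 1 and omits the monotonicity step used to obtain <italic>q</italic>-values; the function name <monospace>naive_fdr</monospace> is hypothetical.</p>
<preformat><![CDATA[
```python
from bisect import bisect_right


def naive_fdr(pvals):
    """Uniform-null FDR estimate at each observed p-value t.

    FDR(t) is estimated as m * t / #{p <= t}, capped at 1.
    Assumptions of this sketch: uniform null p-values, null
    proportion set to 1, no monotonicity adjustment.
    """
    m = len(pvals)
    ranked = sorted(pvals)
    # bisect_right counts how many p-values are <= t (ties included).
    return [min(1.0, m * t / bisect_right(ranked, t)) for t in pvals]
```
]]></preformat>
<p>Because the SNPs that emerge from the triangle filters need not have uniform null <italic>p</italic>-values, the expected count <italic>m</italic>&#x02009;&#x000D7;&#x02009;<italic>t</italic> in this estimate can understate the number of false positives.</p>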
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>Traditional FDR applied to the triangle method for simulated data</bold>. The triangle method may yield numerous highly significant SNP associations, based on the traditional FDR approach, from a simulated phenotype. In this case, several significant SNPs are obtained from simulated data although none should be observed.</p></caption>
<graphic xlink:href="fgene-03-00202-g003.tif"/>
</fig>
</sec>
<sec>
<title>Simulating the null distribution</title>
<p>We describe here our approach to generating an empirical null distribution of <italic>p</italic>-values (Figure <xref ref-type="fig" rid="F4">4</xref>). First, let <italic>Y</italic><sub>1</sub>, <italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, &#x02026;, <italic>Y<sub>n</sub></italic> be simulated phenotypes obtained from permuting the phenotype data. (Typically, <italic>n</italic>&#x02009;&#x0003D;&#x02009;1000.) In case covariates are used, they should be relabeled in sync with the phenotype. For each simulated phenotype, we apply the same triangle method. For each <italic>Y<sub>i</sub></italic>, we derive the set of gene expression traits <italic>g<sub>ij</sub></italic> that meet the threshold, <italic>p</italic>-value&#x02009;&#x0003C;&#x02009;<italic>p</italic><sub>gene-phenotype</sub>, where the associations between the phenotype <italic>Y<sub>i</sub></italic> and gene expression traits are calculated while preserving the correlation structure of all gene expression phenotypes. For each <italic>g<sub>ij</sub></italic>, we retrieve the set of eQTLs, <italic>S<sub>ijk</sub></italic>, associated with the gene at the pre-defined threshold, <italic>p</italic>-value&#x02009;&#x0003C;&#x02009;<italic>p</italic><sub>SNP-gene</sub>. 
The subset of these eQTL SNPs that satisfy <italic>p</italic>-value&#x02009;&#x0003C;&#x02009;<italic>p</italic><sub>SNP-phenotype</sub> provides a set of <italic>p</italic>-values <inline-formula><mml:math id="M1"><mml:mrow><mml:mrow><mml:mo class="MathClass-open">{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle class="text"><mml:mtext class="textit" mathvariant="italic">P</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:msup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:mrow><mml:mo class="MathClass-close">}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, for each simulated phenotype <italic>Y<sub>i</sub></italic>. Note that each such set <inline-formula><mml:math id="M2"><mml:mrow><mml:mrow><mml:mo class="MathClass-open">{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle class="text"><mml:mtext class="textit" mathvariant="italic">P</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi><mml:msup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:mrow><mml:mo class="MathClass-close">}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of <italic>p</italic>-values may differ in count between simulated phenotypes. Note that <italic>i</italic> indexes simulations, <italic>j</italic> indexes genes, and <italic>k</italic> indexes eQTLs.</p>
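<p>The permutation scheme described above can be sketched as follows. This is only an outline under stated assumptions: <monospace>run_triangle</monospace> is a hypothetical user-supplied function that applies the triangle method to one permuted phenotype (with genotype and expression data held fixed, so that their correlation structure is preserved) and returns the resulting SNP-phenotype <italic>p</italic>-values; the names <monospace>simulate_null_pvalues</monospace>, <monospace>n_perm</monospace>, and <monospace>seed</monospace> are likewise introduced here.</p>
<preformat><![CDATA[
```python
import random


def simulate_null_pvalues(phenotype, covariates, run_triangle,
                          n_perm=1000, seed=0):
    """Build an empirical null by permuting the phenotype labels.

    phenotype, covariates: per-individual sequences of equal length.
    run_triangle(y, cov): hypothetical callable returning the list of
        SNP-phenotype p-values obtained from the triangle method.
    Returns (per_replicate, pooled): the p-value list of each
    replicate, and all replicates concatenated.
    """
    rng = random.Random(seed)
    order = list(range(len(phenotype)))
    per_replicate = []
    for _ in range(n_perm):
        rng.shuffle(order)
        # Apply the same permutation to phenotype and covariates so
        # that they stay in sync; genotypes and expression are untouched.
        y_perm = [phenotype[i] for i in order]
        cov_perm = [covariates[i] for i in order]
        per_replicate.append(list(run_triangle(y_perm, cov_perm)))
    # Replicates may contribute different numbers of p-values.
    pooled = [p for rep in per_replicate for p in rep]
    return per_replicate, pooled
```
]]></preformat>
<p>The length of the pooled list corresponds to the total number of SNPs selected across all simulated phenotypes.</p>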
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p><bold>Simulating the null distribution</bold>. Simulated phenotypes were obtained by permuting phenotype values. Associations between gene expression traits and phenotype were tested while preserving the correlation structure among the gene expression traits. The eQTLs for selected genes were retrieved from our eQTL database SCAN. Finally, the eQTLs were tested for association with phenotype. For each of <italic>n</italic>&#x02009;&#x0003D;&#x02009;1000 replicates, a set of SNP association <italic>p</italic>-values is generated.</p></caption>
<graphic xlink:href="fgene-03-00202-g004.tif"/>
</fig>
<p>We utilize these sets of <italic>p</italic>-values derived from simulated phenotypes to estimate the null distribution of <italic>p</italic>-values. Having shown the limitation of the use of the traditional FDR for the integrative triangle method, we derive a simple formula to estimate the FDR using this empirical null distribution.</p>
</sec>
<sec>
<title>Empirical FDR</title>
<p>We closely follow Storey&#x02019;s approach (Storey and Tibshirani, <xref ref-type="bibr" rid="B32">2003</xref>) to estimate the FDR. The difference in our approach is that we do not assume that the null distribution of <italic>p</italic>-values is uniform. Instead, we use the empirical distribution generated by simulating the phenotype and performing the integrative analysis. We define the significance level <italic>t</italic> and reject the null hypothesis of no association for all <italic>p</italic>-values less than or equal to <italic>t</italic>. We use the actual values in the observed vector of <italic>p</italic>-values as cutoffs. Thus, for each <italic>p</italic>-value, <italic>t</italic>, in the observed vector of <italic>p</italic>-values, we compute the FDR of the strategy of rejecting all <italic>p</italic>-values less than or equal to <italic>t</italic>. Let the number of falsely significant SNPs be denoted as <italic>F</italic>(<italic>t</italic>)&#x02009;&#x0003D;&#x02009;&#x00023;{null <italic>p<sub>i</sub></italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>, <italic>i</italic>&#x02009;&#x0003D;&#x02009;1, &#x02026;, <italic>m</italic>} and the number of significant SNPs be denoted as <italic>S</italic>(<italic>t</italic>)&#x02009;&#x0003D;&#x02009;&#x00023;{<italic>p<sub>i</sub></italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>, <italic>i</italic>&#x02009;&#x0003D;&#x02009;1, &#x02026;, <italic>m</italic>}, with <italic>m</italic> the total number of SNPs after applying the integrative approach. We estimate the FDR as follows:</p>
<disp-formula id="E1"><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd class="eqnarray-1"><mml:mi>F</mml:mi><mml:mi>D</mml:mi><mml:mi>R</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo class="MathClass-rel">=</mml:mo><mml:mi>E</mml:mi><mml:mfenced separators="" open="[" close="]"><mml:mrow><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:mfenced></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="eqnarray-1"></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo class="MathClass-rel">&#x02248;</mml:mo><mml:mfrac><mml:mrow><mml:mi>E</mml:mi><mml:mfenced separators="" open="[" close="]"><mml:mrow><mml:mi>F</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow></mml:mfenced></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mfenced separators="" open="[" close="]"><mml:mrow><mml:mi>S</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"><mml:mtext class="eqnarray">(1)</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="eqnarray-1"></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo 
class="MathClass-rel">=</mml:mo><mml:mfrac><mml:mrow><mml:mi>m</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi><mml:mspace width="0.3em" class="thinspace"/><mml:mstyle class="text"><mml:mtext>and&#x000A0;null</mml:mtext></mml:mstyle></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="eqnarray-1"></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo class="MathClass-rel">=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi><mml:mspace width="0.3em" class="thinspace"/><mml:mstyle class="text"><mml:mtext>and&#x000A0;null</mml:mtext></mml:mstyle></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mspace width="0.3em" class="thinspace"/></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"><mml:mtext class="eqnarray">(2)</mml:mtext></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>E</italic>[.] is the expectation operator. The approximate equality in Eq. <xref ref-type="disp-formula" rid="E1">1</xref> is proven by Storey (<xref ref-type="bibr" rid="B31">2003</xref>).</p>
<p>The denominator, <italic>P</italic>(<italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>), is estimated by the observed proportion of SNPs with <italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>,</p>
<disp-formula id="E2"><mml:math id="M4"><mml:mrow><mml:mo>&#x00023;</mml:mo><mml:mrow><mml:mo>{</mml:mo> <mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mtext>obs,&#x02009;</mml:mtext><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>m</mml:mi></mml:mrow> <mml:mo>}</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:math></disp-formula>
<p>The numerator can be factored as <italic>P</italic>(<italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic> and null)&#x02009;&#x0003D;&#x02009;<italic>P</italic>(<italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>&#x02009;|&#x02009;null)&#x000B7;<italic>P</italic>(null). The first factor <italic>P</italic>(<italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>&#x02009;|&#x02009;null) is estimated using the empirical distribution: &#x00023;{<italic>p</italic><sub>sim,<italic>i</italic></sub>&#x02009;&#x02264;&#x02009;<italic>t</italic>, <italic>i</italic>&#x02009;&#x0003D;&#x02009;1, &#x02026;, <italic>M</italic><sub>0</sub>}/<italic>M</italic><sub>0</sub>, where the <italic>p</italic><sub>sim</sub> values are the <italic>p</italic>-values generated with the simulated phenotypes and <italic>M</italic><sub>0</sub> is the total number of SNPs selected across all simulated phenotypes, that is, <italic>M</italic><sub>0</sub>&#x02009;&#x0003D;&#x02009;&#x003A3;<italic>m<sub>o,s</sub></italic>, where <italic>m<sub>o,s</sub></italic> is the number of eQTLs selected after applying the triangle method to the simulated phenotype <italic>Y<sub>s</sub></italic>. Note that for uniformly distributed <italic>p</italic>-values, we would have <italic>P</italic>(<italic>p</italic>&#x02009;&#x02264;&#x02009;<italic>t</italic>&#x02009;|&#x02009;null)&#x02009;&#x0003D;&#x02009;<italic>t</italic>. We know, however, that when the set of SNPs is derived from the integrative approach, the null <italic>p</italic>-values may not be distributed uniformly, as illustrated in Figure <xref ref-type="fig" rid="F5">5</xref>. The second factor <italic>P</italic>(null) is the proportion of SNPs that are unrelated to the phenotype and may be estimated as the ratio</p>
<disp-formula id="E3"><mml:math id="M5"><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>&#x003C0;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo>&#x00023;</mml:mo><mml:mrow><mml:mo>{</mml:mo> <mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mtext>obs,&#x02009;</mml:mtext><mml:mi>i</mml:mi><mml:mtext>&#x02009;</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x0003E;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>&#x003BB;</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>m</mml:mi></mml:mrow> <mml:mo>}</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x00023;</mml:mo><mml:mrow><mml:mo>{</mml:mo> <mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mtext>sim,&#x02009;</mml:mtext><mml:mi>i</mml:mi><mml:mtext>&#x02009;</mml:mtext></mml:mrow></mml:msub><mml:mo>&#x0003E;</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>&#x003BB;</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x02009;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow> <mml:mo>}</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<p>where &#x003BB; is a tuning parameter that plays the same role as in the estimator of Storey and Tibshirani (<xref ref-type="bibr" rid="B32">2003</xref>): <italic>p</italic>-values above &#x003BB; are taken to arise mostly from SNPs unrelated to the phenotype. Alternatively, <italic>P</italic>(null) may be set to 1 to yield a more conservative estimate of the FDR.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p><bold>Distribution of <italic>p</italic>-values for all SNPs and triangle method-derived SNPs from actual phenotype as well as simulated phenotypes</bold>. QQ plots are shown for all SNPs as well as the SNPs that come out of the triangle method. Gray dots represent the QQ plots for the triangle method-derived SNPs from 1000 simulated phenotypes. Note that the triangle method may yield spurious associations if we rely on the traditional FDR.</p></caption>
<graphic xlink:href="fgene-03-00202-g005.tif"/>
</fig>
<p>In summary, we estimate the FDR based on the empirical distribution as follows:</p>
<disp-formula id="E4"><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd class="eqnarray-1"><mml:mstyle class="text"><mml:mtext>e</mml:mtext></mml:mstyle><mml:mover accent="false"><mml:mrow><mml:mstyle class="text"><mml:mtext>FDR</mml:mtext></mml:mstyle></mml:mrow><mml:mo class="MathClass-op">^</mml:mo></mml:mover><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo class="MathClass-rel">=</mml:mo><mml:mfrac><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>sim</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:msub><mml:mrow><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>obs</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:mfrac><mml:mo class="MathClass-punc">.</mml:mo><mml:mfrac><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>obs</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x0003E;</mml:mo><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" 
close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>sim</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x0003E;</mml:mo><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:msub><mml:mrow><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"><mml:mtext class="eqnarray">(3)</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd class="eqnarray-1"></mml:mtd><mml:mtd class="eqnarray-2"><mml:mo class="MathClass-rel">=</mml:mo><mml:mfrac><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>sim</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced></mml:mrow><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>obs</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac><mml:mo class="MathClass-punc">.</mml:mo><mml:mfrac><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>obs</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x0003E;</mml:mo><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:mfenced></mml:mrow><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>sim</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo 
class="MathClass-rel">&#x0003E;</mml:mo><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac></mml:mtd><mml:mtd class="eqnarray-3"></mml:mtd><mml:mtd class="eqnarray-4"><mml:mtext class="eqnarray">(4)</mml:mtext></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>with &#x003BB;&#x02009;&#x0003D;&#x02009;0.5. We can also use the more conservative estimate</p>
<disp-formula id="E5"><mml:math id="M7"><mml:mstyle class="text"><mml:mtext>e</mml:mtext></mml:mstyle><mml:mover accent="false"><mml:mrow><mml:mstyle class="text"><mml:mtext>FDR</mml:mtext></mml:mstyle></mml:mrow><mml:mo class="MathClass-op">^</mml:mo></mml:mover><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mfrac><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>sim</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:msub><mml:mrow><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>#</mml:mi><mml:mfenced separators="" open="{" close="}"><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext>obs</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo class="MathClass-rel">&#x02264;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfenced><mml:mi></mml:mi><mml:mo>/</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula>
<p>To ensure that the estimated FDR does not decrease as the <italic>p</italic>-value threshold increases, we define <italic>q</italic>-values as</p>
<disp-formula id="E6"><label>(5)</label><mml:math id="M8"><mml:mover accent="true"><mml:mrow><mml:mi>q</mml:mi></mml:mrow><mml:mo class="MathClass-op">^</mml:mo></mml:mover><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow><mml:mo class="MathClass-rel">=</mml:mo><mml:mstyle class="text"><mml:mtext>mi</mml:mtext></mml:mstyle><mml:msub><mml:mrow><mml:mstyle class="text"><mml:mtext>n</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo class="MathClass-rel">&#x02265;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mstyle class="text"><mml:mtext>e</mml:mtext></mml:mstyle><mml:mover accent="false"><mml:mrow><mml:mstyle class="text"><mml:mtext>FDR</mml:mtext></mml:mstyle></mml:mrow><mml:mo class="MathClass-op">^</mml:mo></mml:mover><mml:mrow><mml:mo class="MathClass-open">(</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo class="MathClass-close">)</mml:mo></mml:mrow></mml:math></disp-formula>
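<p>As a minimal sketch of the estimator and the <italic>q</italic>-value definition above, the following Python code (an illustration, not the R function distributed with this article) computes the count-ratio form of the empirical FDR, in which the factors <italic>M</italic> and <italic>M</italic><sub>0</sub> cancel, and then takes a running minimum over the observed <italic>p</italic>-values. The function names, the choice to evaluate thresholds only at the observed <italic>p</italic>-values, and the error raised when a count in a denominator is zero are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
def empirical_fdr(p_obs, p_sim, t, lam=0.5):
    """Count-ratio estimate of eFDR(t) from observed and simulated p-values.

    p_obs: observed p-values; p_sim: pooled p-values from simulated
    (null) phenotypes; lam: the tuning parameter lambda (0.5 in the text).
    """
    obs_le_t = sum(p <= t for p in p_obs)
    sim_le_t = sum(p <= t for p in p_sim)
    obs_gt_lam = sum(p > lam for p in p_obs)
    sim_gt_lam = sum(p > lam for p in p_sim)
    if obs_le_t == 0 or sim_gt_lam == 0:
        # Assumption of this sketch: treat an empty denominator as an error.
        raise ValueError("eFDR is undefined when a denominator count is zero")
    return (sim_le_t / obs_le_t) * (obs_gt_lam / sim_gt_lam)


def q_values(p_obs, p_sim, lam=0.5):
    """q(t) = min over p >= t of eFDR(p), evaluated at each observed p-value."""
    order = sorted(range(len(p_obs)), key=lambda i: p_obs[i])
    q = [0.0] * len(p_obs)
    running_min = float("inf")
    # Walk from the largest p-value down, carrying the minimum eFDR seen so far.
    for i in reversed(order):
        running_min = min(running_min, empirical_fdr(p_obs, p_sim, p_obs[i], lam))
        q[i] = running_min
    return q
```
]]></preformat>
<p>The running minimum is what makes the resulting <italic>q</italic>-values non-decreasing in the <italic>p</italic>-value, even where the raw estimate is not.</p>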
</sec>
<sec>
<title>Etoposide pharmacogenomics</title>
<p>A triangle method similar to the one described here was originally applied to cellular sensitivity data for etoposide (Huang et al., <xref ref-type="bibr" rid="B16">2007b</xref>), one of the most widely used anti-cancer agents. Using our empirical FDR approach, we re-analyzed the phenotype data from the original experiments, which quantified the cytotoxic effect of the drug on the cell lines using a colorimetric assay, as previously described (Huang et al., <xref ref-type="bibr" rid="B16">2007b</xref>). We conducted our study on the 90 HapMap cell lines of European descent (CEU). The quantitative trait used here was IC<sub>50</sub>, defined as the concentration required for 50% cell growth inhibition.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<sec>
<title>R function for calculating empirical FDR</title>
<p>We provide an R function for estimating the empirical FDR that can be used once the observed and the simulated <italic>p</italic>-values are generated (<uri xlink:href="http://www.scandb.org/newinterface/empiricalFDR.R">http://www.scandb.org/newinterface/empiricalFDR.R</uri>). The way these <italic>p</italic>-values are generated will depend on the specific integration method used, the eQTL mapping database, and the number of components in the &#x0201C;genetic machinery.&#x0201D;</p>
<sec>
<title>Computation time</title>
<p>Step 1 (see Figure <xref ref-type="fig" rid="F4">4</xref>) requires about 10,000 linear regressions, one per transcript. This takes a few seconds in R with a fast linear regression routine such as the one we implemented and made available at <uri xlink:href="http://www.scandb.org/newinterface/fastlm.R">http://www.scandb.org/newinterface/fastlm.R</uri>. Step 2 only requires querying the eQTLs for the set of genes selected in step 1, which takes a fraction of a second. After steps 1 and 2, only a few SNPs are left (typically around 1000 or fewer), so step 3 can also be done in a fraction of a second. Adding up all three steps, the method with 1000 permutations would take a couple of hours of computing time on a typical desktop available in 2012.</p>
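<p>One way such a speed-up can be obtained is sketched below in Python, assuming a simple one-predictor regression per transcript: the phenotype is centered once, and each slope <italic>t</italic> statistic is derived from the Pearson correlation <italic>r</italic> as <italic>t</italic> = <italic>r</italic> sqrt((<italic>n</italic> - 2)/(1 - <italic>r</italic><sup>2</sup>)). This is not the fastlm.R code, whose internals are not described here; the function name and data layout are hypothetical. Converting the statistics to <italic>p</italic>-values requires the <italic>t</italic> distribution with <italic>n</italic> - 2 degrees of freedom and is omitted.</p>
<preformat><![CDATA[
```python
import math


def regression_t_stats(y, transcripts):
    """Slope t statistics for regressing y on each transcript separately.

    y: phenotype values, one per sample.
    transcripts: list of expression vectors, each the same length as y.
    Assumes every vector varies across samples and is not perfectly
    correlated with y (otherwise a division by zero occurs).
    """
    n = len(y)
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]          # computed once, reused
    syy = sum(v * v for v in y_centered)
    t_stats = []
    for x in transcripts:
        x_mean = sum(x) / n
        x_centered = [v - x_mean for v in x]
        sxx = sum(v * v for v in x_centered)
        sxy = sum(a * b for a, b in zip(x_centered, y_centered))
        r = sxy / math.sqrt(sxx * syy)            # Pearson correlation
        t_stats.append(r * math.sqrt((n - 2) / (1.0 - r * r)))
    return t_stats
```
]]></preformat>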
</sec>
</sec>
<sec>
<title>Traditional GWAS and SNP selection via eQTLs</title>
<p>The GWAS of etoposide IC<sub>50</sub> did not yield any significant signals, as might be expected given the small sample size. Figure <xref ref-type="fig" rid="F5">5</xref> shows a QQ plot of the distribution of <italic>p</italic>-values (as circles). However, we found a highly significant enrichment for gene regulatory signals among the etoposide-associated variants relative to frequency-matched SNPs (Gamazon et al., <xref ref-type="bibr" rid="B7">2010a</xref>). This finding raises the possibility of using eQTL annotation to increase the power to detect true associations. We therefore proceeded to incorporate eQTL functional annotation through the integrative triangle method.</p>
</sec>
<sec>
<title>Genetic variation associated with etoposide cytotoxicity identified through the triangle method</title>
<p>Expression levels had been generated by our group on these cell lines for more than 10,000 genes, allowing us to test for association between etoposide IC<sub>50</sub> and gene expression traits; genes meeting <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.05 (see Table <xref ref-type="supplementary-material" rid="SM1">S1</xref> in Supplementary Material) were carried forward in the triangle analysis. We then used SCAN (Gamazon et al., <xref ref-type="bibr" rid="B8">2010b</xref>), a public repository for the results of our eQTL studies on the HapMap cell lines, to identify expression-associated SNPs (<italic>p</italic>&#x02009;&#x0003C;&#x02009;10<sup>&#x02212;4</sup>; see Table <xref ref-type="supplementary-material" rid="SM2">S2</xref> in Supplementary Material) for the genes associated with etoposide IC<sub>50</sub>. Finally, the selected eQTLs were tested for association with etoposide IC<sub>50</sub>. Figure <xref ref-type="fig" rid="F5">5</xref> shows the QQ plot of association <italic>p</italic>-values for all SNPs, the QQ plot for the final SNP set derived from the triangle method, and the QQ plot for the triangle method-prioritized SNPs from each of 1000 simulated phenotypes. The figure illustrates that certain eQTL SNPs from this triangle method-derived SNP set attained a (traditional) FDR&#x02009;&#x0003C;&#x02009;0.05, but also that the triangle method may yield spurious associations when the traditional FDR is used.</p>
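<p>The three filtering steps described in this paragraph can be sketched as follows. The Python code is only an illustration under assumed inputs: the dictionary layouts, the function name, and the parameter names are hypothetical, and the default thresholds simply mirror the <italic>p</italic>-value cutoffs quoted above.</p>
<preformat><![CDATA[
```python
def triangle_filter(gene_pheno_p, eqtl_p, snp_pheno_p,
                    gene_cut=0.05, eqtl_cut=1e-4):
    """Return {snp: phenotype p-value} for SNPs surviving the triangle filter.

    gene_pheno_p: {gene: p-value of expression vs. phenotype}   (step 1)
    eqtl_p:       {(snp, gene): p-value of SNP vs. expression}  (step 2)
    snp_pheno_p:  {snp: p-value of SNP vs. phenotype}           (step 3)
    """
    # Step 1: keep genes whose expression is associated with the phenotype.
    genes = {g for g, p in gene_pheno_p.items() if p < gene_cut}
    # Step 2: keep SNPs that are eQTLs for at least one retained gene.
    snps = {snp for (snp, gene), p in eqtl_p.items()
            if gene in genes and p < eqtl_cut}
    # Step 3: report the SNP-phenotype p-values for the retained SNPs.
    return {snp: snp_pheno_p[snp] for snp in sorted(snps) if snp in snp_pheno_p}
```
]]></preformat>
<p>The returned <italic>p</italic>-values are the ones whose significance must then be assessed against the simulated phenotypes, since the filtering itself alters their null distribution.</p>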
</sec>
<sec>
<title>Empirical FDR identifies significant associations with cellular sensitivity to etoposide</title>
<p>We applied our proposed empirical FDR method to the observed set of <italic>p</italic>-values from the triangle analysis-derived set of SNPs. To generate an empirical null distribution of <italic>p</italic>-values, we conducted simulations (see <xref ref-type="sec" rid="s1">Materials and Methods</xref>). Table <xref ref-type="supplementary-material" rid="SM3">S3</xref> in Supplementary Material lists the most significant etoposide-associated SNPs based on our empirical FDR method. Note the comparison between traditional FDR and eFDR for the most highly ranked SNPs prioritized by the triangle method (based on unadjusted <italic>p</italic>-value), showing that traditional FDR inflates the significance of selected SNPs.</p>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>Integrative approaches to diverse genomics datasets promise to resolve some important biological problems and, perhaps as importantly, generate novel hypotheses. Here we developed a <italic>computationally feasible</italic> permutation method to <italic>quantify the significance</italic> of findings arising from an integrative approach. The triangle method, a highly plausible approach to SNP prioritization and an example of how diverse high-throughput datasets may be integrated, requires an assessment of the resulting findings. This integrative method incorporates genotypic and expression data to identify trait-correlated genes that are under the regulation of eQTLs, yielding a set of candidate SNPs potentially important for the genetic etiology of the trait. Our proposed empirical FDR approach takes into account not only the integrative nature of the triangle method but also the correlation structure among gene expression traits and among genotypes. It aims to provide a sound quantification of the significance of the prioritized SNPs from the integrative method.</p>
<p>It should be noted that our approach separates the phenotype from what we are calling the &#x0201C;genetic machinery&#x0201D; (e.g., genotype, gene expression, protein expression, methylation). Only the phenotype is permuted and the relationships within the genetic machinery are preserved. Consequently, we avoid having to perform multiple eQTL mappings (the most computationally costly permutation) because <italic>p</italic>-values in each arm are used for prioritization and not for determining the significance of the associations. Importantly, our approach differs from other approaches wherein the permutation is conducted on each arm of the triangle. In the latter approach, the threshold for significance can be arbitrary or unnecessarily conservative. A well-chosen set of thresholds will determine the performance of the integrative approach. In our method, we provide a measure of significance that is well-calibrated regardless of the set of thresholds used. Furthermore, in contrast to approaches that apply a threshold (e.g., Bonferroni) at each step of the integrative process, our method provides an overall measure of significance for the results of the integrative analysis.</p>
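<p>The permutation scheme described in this paragraph can be sketched as follows, assuming a callable that reruns the full integrative analysis for a given phenotype vector and returns the <italic>p</italic>-values of the prioritized SNPs. The Python code is an illustration only; the names <monospace>simulate_null_p_values</monospace> and <monospace>pipeline</monospace> and the fixed seed are hypothetical. Only the phenotype is shuffled, so the genotype and expression data, and hence the eQTL mapping, are left untouched.</p>
<preformat><![CDATA[
```python
import random


def simulate_null_p_values(phenotype, pipeline, n_perm=1000, seed=0):
    """Pool p-values obtained by rerunning `pipeline` on permuted phenotypes.

    phenotype: list of trait values, one per sample.
    pipeline:  hypothetical callable mapping a phenotype vector to the list
               of p-values produced by the integrative analysis.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    pooled = []
    for _ in range(n_perm):
        permuted = list(phenotype)     # copy, so the input is not modified
        rng.shuffle(permuted)          # break the phenotype-genotype link
        pooled.extend(pipeline(permuted))
    return pooled
```
]]></preformat>
<p>The pooled values play the role of the simulated <italic>p</italic>-values in the empirical FDR estimate.</p>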
<p>Our quantification approach can easily accommodate hub eQTL analysis (SNPs associated with multiple genes, also referred to as master regulators). To do so, the filtering procedure requires that the SNPs be eQTLs for a minimum number of phenotype-associated gene expression traits. As long as the permutation steps follow the same filtering algorithm as the one used for the observed data, our method will yield a valid FDR estimate. Likewise, our method can be applied to both quantitative and binary outcomes.</p>
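<p>A hub eQTL filter of this kind might look like the following Python sketch. The function name, the input layout, and the default of two target genes are assumptions for illustration; the text does not fix a particular minimum.</p>
<preformat><![CDATA[
```python
from collections import Counter


def hub_eqtl_snps(eqtl_pairs, selected_genes, min_genes=2):
    """Return SNPs that are eQTLs for at least `min_genes` selected genes.

    eqtl_pairs:     iterable of (snp, gene) pairs passing the eQTL threshold.
    selected_genes: set of phenotype-associated genes from the first step.
    """
    # Deduplicate pairs so a repeated (snp, gene) entry is counted once.
    counts = Counter(snp for snp, gene in set(eqtl_pairs)
                     if gene in selected_genes)
    return {snp for snp, c in counts.items() if c >= min_genes}
```
]]></preformat>
<p>The same function, with the same minimum, would have to be applied inside every permutation for the empirical null to match the observed analysis.</p>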
<p>In this study, we also explored the limitations of the traditional FDR when applied to an integrative approach such as the triangle method. In particular, we found that traditional FDR may yield spurious associations from simulated phenotypes. Furthermore, while the use of eQTL information may improve power to detect true associations, traditional FDR may still inflate the significance of the selected SNPs.</p>
<p>We applied our empirical FDR approach to a study of cellular sensitivity to etoposide. Etoposide is a topoisomerase II inhibitor (Sinha et al., <xref ref-type="bibr" rid="B30">1988</xref>) widely used against lung cancer, non-Hodgkin&#x02019;s lymphoma, myelogenous leukemia, and Kaposi&#x02019;s sarcoma. As in the case of other chemotherapeutic agents, the drug is associated with serious toxicities, including bone marrow suppression, diarrhea, and fatigue as well as treatment-induced acute myeloid leukemia (Mistry et al., <xref ref-type="bibr" rid="B23">2005</xref>). Thus, the identification of predictors of response or potentially debilitating toxicities associated with etoposide, including genetic variations, is key to the implementation of an effective treatment regimen and, longer-term, to the realization of an individualized approach to therapy. Based on cell lines derived from large pedigrees, it has been reported that a significant genetic component contributes to cellular sensitivity to etoposide (Peters et al., <xref ref-type="bibr" rid="B27">2011</xref>).</p>
<p>Here, using our empirical FDR method, we identified 12 SNPs showing significant association (eFDR&#x02009;&#x0003C;&#x02009;0.15) with cellular sensitivity to etoposide through their effect on gene expression. The 12 SNPs represent four independent genomic loci (on chromosome 8q12, 2p24, 10q23, and 16q24), of which the 10q23 SNPs are located in the glutamate receptor ionotropic delta-1 subunit (<italic>GRID1</italic>) gene. The expression target genes of rs9808546 (on chromosome 2) show a highly significant enrichment for <italic>acetylation</italic> [<italic>n</italic>&#x02009;&#x0003D;&#x02009;27, Benjamini&#x02013;Hochberg (BH) FDR&#x02009;&#x0003D;&#x02009;0.0018] and <italic>phosphoprotein</italic> (<italic>n</italic>&#x02009;&#x0003D;&#x02009;51, BH FDR&#x02009;&#x0003D;&#x02009;0.0027; Huang da et al., <xref ref-type="bibr" rid="B17">2009</xref>), consistent with studies showing that histone deacetylase inhibitors sensitize cells to cytotoxic effects, particularly those of topoisomerase II agents such as etoposide (Kurz et al., <xref ref-type="bibr" rid="B19">2001</xref>; Marchion et al., <xref ref-type="bibr" rid="B21">2004</xref>; Hajji et al., <xref ref-type="bibr" rid="B12">2008</xref>, <xref ref-type="bibr" rid="B11">2010</xref>). Importantly, now that the significance of the genotype-phenotype associations has been soundly quantified, the gene expression targets of the identified eQTLs provide a set of candidate genes for functional validation and a plausible mechanism for how the genetic variation may mediate its phenotypic effect.</p>
<p>The R code we provide can be used to compute the empirical FDR for any case in which empirical null <italic>p</italic>-values are available regardless of the method used to generate them. Thus, it should prove useful for other integrative approaches.</p>
<p>If confounding factors yield more gene expression traits associated with the phenotype, our method yields a conservative estimate of FDR. The effect of the confounders is to increase the number of noisy genes in the first step and consequently to generate more null eQTLs than there should be in the final set. This decreases the overall significance of real associations, but our method still provides an unbiased estimate of the significance.</p>
<p>In summary, we have developed a computationally feasible approach to assess the significance of genotype-phenotype associations prioritized by an integrative genomic method. As omics datasets become routinely integrated to address important biological problems, the issue our study sought to address becomes increasingly relevant.</p>
</sec>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at <uri xlink:href="http://www.frontiersin.org/Statistical_Genetics_and_Methodology/10.3389/fgene.2012.00202/abstract">http://www.frontiersin.org/Statistical_Genetics_and_Methodology/10.3389/fgene.2012.00202/abstract</uri></p>
<supplementary-material xlink:href="30051_Im_DataSheet1.XLSX" id="SM1" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Table S1</label>
<caption><p><bold>Top gene expression-trait correlations</bold>.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="30051_Im_DataSheet2.XLSX" id="SM2" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Table S2</label>
<caption><p><bold>Top SNPs from etoposide GWAS</bold>.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="30051_Im_DataSheet3.XLSX" id="SM3" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Table S3</label>
<caption><p><bold>Top associations from the eFDR approach</bold>.</p></caption>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>This work was funded in part by PAAR (Pharmacogenomics of Anti-cancer Agents Research; U01 GM61393), by the University of Chicago Cancer Center Support grant P30 CA014599-36 from the National Cancer Institute, and by the Genotype-Tissue Expression project (GTEx; R01 MH090937). R. Stephanie Huang received support from NIH/NIGMS grant K08GM089941, a University of Chicago Cancer Center Support Grant (&#x00023;P30 CA14599), and a Breast Cancer SPORE Career Development Award.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheung</surname> <given-names>V. G.</given-names></name> <name><surname>Conlin</surname> <given-names>L. K.</given-names></name> <name><surname>Weber</surname> <given-names>T. M.</given-names></name> <name><surname>Arcaro</surname> <given-names>M.</given-names></name> <name><surname>Jen</surname> <given-names>K. Y.</given-names></name> <name><surname>Morley</surname> <given-names>M.</given-names></name> <etal/></person-group> (<year>2003</year>). <article-title>Natural variation in human gene expression assessed in lymphoblastoid cells</article-title>. <source>Nat. Genet.</source> <volume>33</volume>, <fpage>422</fpage>&#x02013;<lpage>425</lpage>.<pub-id pub-id-type="doi">10.1038/ng1094</pub-id><pub-id pub-id-type="pmid">12567189</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Correa</surname> <given-names>C. R.</given-names></name> <name><surname>Cheung</surname> <given-names>V. G.</given-names></name></person-group> (<year>2004</year>). <article-title>Genetic variation in radiation-induced expression phenotypes</article-title>. <source>Am. J. Hum. Genet.</source> <volume>75</volume>, <fpage>885</fpage>&#x02013;<lpage>890</lpage>.<pub-id pub-id-type="doi">10.1086/425221</pub-id><pub-id pub-id-type="pmid">15359380</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Degner</surname> <given-names>J. F.</given-names></name> <name><surname>Pai</surname> <given-names>A. A.</given-names></name> <name><surname>Pique-Regi</surname> <given-names>R.</given-names></name> <name><surname>Veyrieras</surname> <given-names>J. B.</given-names></name> <name><surname>Gaffney</surname> <given-names>D. J.</given-names></name> <name><surname>Pickrell</surname> <given-names>J. K.</given-names></name> <etal/></person-group> (<year>2012</year>). <article-title>DNase I sensitivity QTLs are a major determinant of human expression variation</article-title>. <source>Nature</source> <volume>482</volume>, <fpage>390</fpage>&#x02013;<lpage>394</lpage>.<pub-id pub-id-type="doi">10.1038/nature10808</pub-id><pub-id pub-id-type="pmid">22307276</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dermitzakis</surname> <given-names>E. T.</given-names></name></person-group> (<year>2012</year>). <article-title>Cellular genomics for complex traits</article-title>. <source>Nat. Rev. Genet.</source> <volume>13</volume>, <fpage>215</fpage>&#x02013;<lpage>220</lpage>.<pub-id pub-id-type="pmid">22330769</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duan</surname> <given-names>S.</given-names></name> <name><surname>Huang</surname> <given-names>R. S.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Bleibel</surname> <given-names>W. K.</given-names></name> <name><surname>Roe</surname> <given-names>C. A.</given-names></name> <name><surname>Clark</surname> <given-names>T. A.</given-names></name> <etal/></person-group> (<year>2008</year>). <article-title>Genetic architecture of transcript-level variation in humans</article-title>. <source>Am. J. Hum. Genet.</source> <volume>82</volume>, <fpage>1101</fpage>&#x02013;<lpage>1113</lpage>.<pub-id pub-id-type="doi">10.1016/j.ajhg.2008.03.006</pub-id><pub-id pub-id-type="pmid">18439551</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farnham</surname> <given-names>P. J.</given-names></name></person-group> (<year>2009</year>). <article-title>Insights from genomic profiling of transcription factors</article-title>. <source>Nat. Rev. Genet.</source> <volume>10</volume>, <fpage>605</fpage>&#x02013;<lpage>616</lpage>.<pub-id pub-id-type="doi">10.1038/nrg2636</pub-id><pub-id pub-id-type="pmid">19668247</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gamazon</surname> <given-names>E. R.</given-names></name> <name><surname>Huang</surname> <given-names>R. S.</given-names></name> <name><surname>Cox</surname> <given-names>N. J.</given-names></name> <name><surname>Dolan</surname> <given-names>M. E.</given-names></name></person-group> (<year>2010a</year>). <article-title>Chemotherapeutic drug susceptibility associated SNPs are enriched in expression quantitative trait loci</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>107</volume>, <fpage>9287</fpage>&#x02013;<lpage>9292</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.1001827107</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gamazon</surname> <given-names>E. R.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Konkashbaev</surname> <given-names>A.</given-names></name> <name><surname>Duan</surname> <given-names>S.</given-names></name> <name><surname>Kistner</surname> <given-names>E. O.</given-names></name> <name><surname>Nicolae</surname> <given-names>D. L.</given-names></name> <etal/></person-group> (<year>2010b</year>). <article-title>SCAN: SNP and copy number annotation</article-title>. <source>Bioinformatics</source> <volume>26</volume>, <fpage>259</fpage>&#x02013;<lpage>262</lpage>.<pub-id pub-id-type="doi">10.1093/bioinformatics/btp644</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gamazon</surname> <given-names>E. R.</given-names></name> <name><surname>Ziliak</surname> <given-names>D.</given-names></name> <name><surname>Im</surname> <given-names>H. K.</given-names></name> <name><surname>LaCroix</surname> <given-names>B.</given-names></name> <name><surname>Park</surname> <given-names>D. S.</given-names></name> <name><surname>Cox</surname> <given-names>N. J.</given-names></name> <name><surname>Huang</surname> <given-names>R. S.</given-names></name></person-group> (<year>2012</year>). <article-title>Genetic architecture of microRNA expression: implications for the transcriptome and complex traits</article-title>. <source>Am. J. Hum. Genet.</source> <volume>90</volume>, <fpage>1046</fpage>&#x02013;<lpage>1063</lpage>.<pub-id pub-id-type="doi">10.1016/j.ajhg.2012.04.023</pub-id><pub-id pub-id-type="pmid">22658545</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garge</surname> <given-names>N.</given-names></name> <name><surname>Pan</surname> <given-names>H.</given-names></name> <name><surname>Rowland</surname> <given-names>M. D.</given-names></name> <name><surname>Cargile</surname> <given-names>B. J.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Cooley</surname> <given-names>P. C.</given-names></name> <etal/></person-group> (<year>2010</year>). <article-title>Identification of quantitative trait loci underlying proteome variation in human lymphoblastoid cells</article-title>. <source>Mol. Cell Proteomics</source> <volume>9</volume>, <fpage>1383</fpage>&#x02013;<lpage>1399</lpage>.<pub-id pub-id-type="doi">10.1074/mcp.M900378-MCP200</pub-id><pub-id pub-id-type="pmid">20179311</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hajji</surname> <given-names>N.</given-names></name> <name><surname>Wallenborg</surname> <given-names>K.</given-names></name> <name><surname>Vlachos</surname> <given-names>P.</given-names></name> <name><surname>F&#x000FC;llgrabe</surname> <given-names>J.</given-names></name> <name><surname>Hermanson</surname> <given-names>O.</given-names></name> <name><surname>Joseph</surname> <given-names>B.</given-names></name></person-group> (<year>2010</year>). <article-title>Opposing effects of hMOF and SIRT1 on H4K16 acetylation and the sensitivity to the topoisomerase II inhibitor etoposide</article-title>. <source>Oncogene</source> <volume>29</volume>, <fpage>2192</fpage>&#x02013;<lpage>2204</lpage>.<pub-id pub-id-type="doi">10.1038/onc.2009.505</pub-id><pub-id pub-id-type="pmid">20118981</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hajji</surname> <given-names>N.</given-names></name> <name><surname>Wallenborg</surname> <given-names>K.</given-names></name> <name><surname>Vlachos</surname> <given-names>P.</given-names></name> <name><surname>Nyman</surname> <given-names>U.</given-names></name> <name><surname>Hermanson</surname> <given-names>O.</given-names></name> <name><surname>Joseph</surname> <given-names>B.</given-names></name></person-group> (<year>2008</year>). <article-title>Combinatorial action of the HDAC inhibitor trichostatin A and etoposide induces caspase-mediated AIF-dependent apoptotic cell death in non-small cell lung carcinoma cells</article-title>. <source>Oncogene</source> <volume>27</volume>, <fpage>3134</fpage>&#x02013;<lpage>3144</lpage>.<pub-id pub-id-type="doi">10.1038/sj.onc.1210976</pub-id><pub-id pub-id-type="pmid">18071312</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hawkins</surname> <given-names>R. D.</given-names></name> <name><surname>Hon</surname> <given-names>G. C.</given-names></name> <name><surname>Ren</surname> <given-names>B.</given-names></name></person-group> (<year>2010</year>). <article-title>Next-generation genomics: an integrative approach</article-title>. <source>Nat. Rev. Genet.</source> <volume>11</volume>, <fpage>476</fpage>&#x02013;<lpage>486</lpage>.<pub-id pub-id-type="pmid">20531367</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hindorff</surname> <given-names>L. A.</given-names></name> <name><surname>Sethupathy</surname> <given-names>P.</given-names></name> <name><surname>Junkins</surname> <given-names>H. A.</given-names></name> <name><surname>Ramos</surname> <given-names>E. M.</given-names></name> <name><surname>Mehta</surname> <given-names>J. P.</given-names></name> <name><surname>Collins</surname> <given-names>F. S.</given-names></name> <etal/></person-group> (<year>2009</year>). <article-title>Potential etiologic and functional implications of genome-wide association loci for human diseases and traits</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>106</volume>, <fpage>9362</fpage>&#x02013;<lpage>9367</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.0903103106</pub-id><pub-id pub-id-type="pmid">19474294</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>R. S.</given-names></name> <name><surname>Duan</surname> <given-names>S.</given-names></name> <name><surname>Shukla</surname> <given-names>S. J.</given-names></name> <name><surname>Kistner</surname> <given-names>E. O.</given-names></name> <name><surname>Clark</surname> <given-names>T. A.</given-names></name> <name><surname>Chen</surname> <given-names>T. X.</given-names></name> <etal/></person-group> (<year>2007a</year>). <article-title>Identification of genetic variants contributing to cisplatin-induced cytotoxicity by use of a genomewide approach</article-title>. <source>Am. J. Hum. Genet.</source> <volume>81</volume>, <fpage>427</fpage>&#x02013;<lpage>437</lpage>.<pub-id pub-id-type="doi">10.1086/519850</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>R. S.</given-names></name> <name><surname>Duan</surname> <given-names>S.</given-names></name> <name><surname>Bleibel</surname> <given-names>W. K.</given-names></name> <name><surname>Kistner</surname> <given-names>E. O.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Clark</surname> <given-names>T. A.</given-names></name> <etal/></person-group> (<year>2007b</year>). <article-title>A genome-wide approach to identify genetic variants that contribute to etoposide-induced cytotoxicity</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>104</volume>, <fpage>9758</fpage>&#x02013;<lpage>9763</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.0609717104</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang da</surname> <given-names>W.</given-names></name> <name><surname>Sherman</surname> <given-names>B. T.</given-names></name> <name><surname>Lempicki</surname> <given-names>R. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources</article-title>. <source>Nat. Protoc.</source> <volume>4</volume>, <fpage>44</fpage>&#x02013;<lpage>57</lpage>.<pub-id pub-id-type="doi">10.1038/nprot.2008.211</pub-id><pub-id pub-id-type="pmid">19131956</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ko</surname> <given-names>D. C.</given-names></name> <name><surname>Shukla</surname> <given-names>K. P.</given-names></name> <name><surname>Fong</surname> <given-names>C.</given-names></name> <name><surname>Wasnick</surname> <given-names>M.</given-names></name> <name><surname>Brittnacher</surname> <given-names>M. J.</given-names></name> <name><surname>Wurfel</surname> <given-names>M. M.</given-names></name> <etal/></person-group> (<year>2009</year>). <article-title>A genome-wide in vitro bacterial-infection screen reveals human variation in the host response associated with inflammatory disease</article-title>. <source>Am. J. Hum. Genet.</source> <volume>85</volume>, <fpage>214</fpage>&#x02013;<lpage>227</lpage>.<pub-id pub-id-type="doi">10.1016/j.ajhg.2009.07.012</pub-id><pub-id pub-id-type="pmid">19664744</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kurz</surname> <given-names>E. U.</given-names></name> <name><surname>Wilson</surname> <given-names>S. E.</given-names></name> <name><surname>Leader</surname> <given-names>K. B.</given-names></name> <name><surname>Sampey</surname> <given-names>B. P.</given-names></name> <name><surname>Allan</surname> <given-names>W. P.</given-names></name> <name><surname>Yalowich</surname> <given-names>J. C.</given-names></name> <etal/></person-group> (<year>2001</year>). <article-title>The histone deacetylase inhibitor sodium butyrate induces DNA topoisomerase II alpha expression and confers hypersensitivity to etoposide in human leukemic cell lines</article-title>. <source>Mol. Cancer Ther.</source> <volume>1</volume>, <fpage>121</fpage>&#x02013;<lpage>131</lpage>.<pub-id pub-id-type="pmid">12467229</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laird</surname> <given-names>P. W.</given-names></name></person-group> (<year>2010</year>). <article-title>Principles and challenges of genomewide DNA methylation analysis</article-title>. <source>Nat. Rev. Genet.</source> <volume>11</volume>, <fpage>191</fpage>&#x02013;<lpage>203</lpage>.<pub-id pub-id-type="doi">10.1038/nrg2732</pub-id><pub-id pub-id-type="pmid">20125086</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marchion</surname> <given-names>D. C.</given-names></name> <name><surname>Bicaku</surname> <given-names>E.</given-names></name> <name><surname>Daud</surname> <given-names>A. I.</given-names></name> <name><surname>Richon</surname> <given-names>V.</given-names></name> <name><surname>Sullivan</surname> <given-names>D. M.</given-names></name> <name><surname>Munster</surname> <given-names>P. N.</given-names></name></person-group> (<year>2004</year>). <article-title>Sequence-specific potentiation of topoisomerase II inhibitors by the histone deacetylase inhibitor suberoylanilide hydroxamic acid</article-title>. <source>J. Cell. Biochem.</source> <volume>92</volume>, <fpage>223</fpage>&#x02013;<lpage>237</lpage>.<pub-id pub-id-type="doi">10.1002/jcb.20045</pub-id><pub-id pub-id-type="pmid">15108350</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Metzker</surname> <given-names>M. L.</given-names></name></person-group> (<year>2010</year>). <article-title>Sequencing technologies &#x02013; the next generation</article-title>. <source>Nat. Rev. Genet.</source> <volume>11</volume>, <fpage>31</fpage>&#x02013;<lpage>46</lpage>.<pub-id pub-id-type="doi">10.1038/nrg2626</pub-id><pub-id pub-id-type="pmid">19997069</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mistry</surname> <given-names>A. R.</given-names></name> <name><surname>Felix</surname> <given-names>C. A.</given-names></name> <name><surname>Whitmarsh</surname> <given-names>R. J.</given-names></name> <name><surname>Mason</surname> <given-names>A.</given-names></name> <name><surname>Reiter</surname> <given-names>A.</given-names></name> <name><surname>Cassinat</surname> <given-names>B.</given-names></name> <etal/></person-group> (<year>2005</year>). <article-title>DNA topoisomerase II in therapy-related acute promyelocytic leukemia</article-title>. <source>N. Engl. J. Med.</source> <volume>352</volume>, <fpage>1529</fpage>&#x02013;<lpage>1538</lpage>.<pub-id pub-id-type="doi">10.1056/NEJMoa042715</pub-id><pub-id pub-id-type="pmid">15829534</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Montgomery</surname> <given-names>S. B.</given-names></name> <name><surname>Sammeth</surname> <given-names>M.</given-names></name> <name><surname>Gutierrez-Arcelus</surname> <given-names>M.</given-names></name> <name><surname>Lach</surname> <given-names>R. P.</given-names></name> <name><surname>Ingle</surname> <given-names>C.</given-names></name> <name><surname>Nisbett</surname> <given-names>J.</given-names></name> <etal/></person-group> (<year>2010</year>). <article-title>Transcriptome genetics using second generation sequencing in a Caucasian population</article-title>. <source>Nature</source> <volume>464</volume>, <fpage>773</fpage>&#x02013;<lpage>777</lpage>.<pub-id pub-id-type="doi">10.1038/nature08903</pub-id><pub-id pub-id-type="pmid">20220756</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nica</surname> <given-names>A. C.</given-names></name> <name><surname>Montgomery</surname> <given-names>S. B.</given-names></name> <name><surname>Dimas</surname> <given-names>A. S.</given-names></name> <name><surname>Stranger</surname> <given-names>B. E.</given-names></name> <name><surname>Beazley</surname> <given-names>C.</given-names></name> <name><surname>Barroso</surname> <given-names>I.</given-names></name> <etal/></person-group> (<year>2010</year>). <article-title>Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations</article-title>. <source>PLoS Genet.</source> <volume>6</volume>:<fpage>e1000895</fpage>.<pub-id pub-id-type="doi">10.1371/journal.pgen.1000895</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nicolae</surname> <given-names>D. L.</given-names></name> <name><surname>Gamazon</surname> <given-names>E.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Duan</surname> <given-names>S.</given-names></name> <name><surname>Dolan</surname> <given-names>M. E.</given-names></name> <name><surname>Cox</surname> <given-names>N. J.</given-names></name></person-group> (<year>2010</year>). <article-title>Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS</article-title>. <source>PLoS Genet.</source> <volume>6</volume>:<fpage>e1000888</fpage>.<pub-id pub-id-type="doi">10.1371/journal.pgen.1000888</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peters</surname> <given-names>E. J.</given-names></name> <name><surname>Motsinger-Reif</surname> <given-names>A.</given-names></name> <name><surname>Havener</surname> <given-names>T. M.</given-names></name> <name><surname>Everitt</surname> <given-names>L.</given-names></name> <name><surname>Hardison</surname> <given-names>N. E.</given-names></name> <name><surname>Watson</surname> <given-names>V. G.</given-names></name> <etal/></person-group> (<year>2011</year>). <article-title>Pharmacogenomic characterization of US FDA-approved cytotoxic drugs</article-title>. <source>Pharmacogenomics</source> <volume>12</volume>, <fpage>1407</fpage>&#x02013;<lpage>1415</lpage>.<pub-id pub-id-type="doi">10.2217/pgs.10.211</pub-id><pub-id pub-id-type="pmid">22008047</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pickrell</surname> <given-names>J. K.</given-names></name> <name><surname>Marioni</surname> <given-names>J. C.</given-names></name> <name><surname>Pai</surname> <given-names>A. A.</given-names></name> <name><surname>Degner</surname> <given-names>J. F.</given-names></name> <name><surname>Engelhardt</surname> <given-names>B. E.</given-names></name> <name><surname>Nkadori</surname> <given-names>E.</given-names></name> <etal/></person-group> (<year>2010</year>). <article-title>Understanding mechanisms underlying human gene expression variation with RNA sequencing</article-title>. <source>Nature</source> <volume>464</volume>, <fpage>768</fpage>&#x02013;<lpage>772</lpage>.<pub-id pub-id-type="doi">10.1038/nature08872</pub-id><pub-id pub-id-type="pmid">20220758</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schadt</surname> <given-names>E. E.</given-names></name> <name><surname>Molony</surname> <given-names>C.</given-names></name> <name><surname>Chudin</surname> <given-names>E.</given-names></name> <name><surname>Hao</surname> <given-names>K.</given-names></name> <name><surname>Yang</surname> <given-names>X.</given-names></name> <name><surname>Lum</surname> <given-names>P. Y.</given-names></name> <etal/></person-group> (<year>2008</year>). <article-title>Mapping the genetic architecture of gene expression in human liver</article-title>. <source>PLoS Biol.</source> <volume>6</volume>:<fpage>e107</fpage>.<pub-id pub-id-type="doi">10.1371/journal.pbio.0060107</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sinha</surname> <given-names>B. K.</given-names></name> <name><surname>Haim</surname> <given-names>N.</given-names></name> <name><surname>Dusre</surname> <given-names>L.</given-names></name> <name><surname>Kerrigan</surname> <given-names>D.</given-names></name> <name><surname>Pommier</surname> <given-names>Y.</given-names></name></person-group> (<year>1988</year>). <article-title>DNA strand breaks produced by etoposide (VP-16,213) in sensitive and resistant human breast tumor cells: implications for the mechanism of action</article-title>. <source>Cancer Res.</source> <volume>48</volume>, <fpage>5096</fpage>&#x02013;<lpage>5100</lpage>.<pub-id pub-id-type="pmid">2842045</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Storey</surname> <given-names>J. D.</given-names></name></person-group> (<year>2003</year>). <article-title>The positive false discovery rate: a Bayesian interpretation and the q-value</article-title>. <source>Ann. Stat.</source> <volume>31</volume>, <fpage>2013</fpage>&#x02013;<lpage>2035</lpage>.<pub-id pub-id-type="doi">10.1214/aos/1074290335</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Storey</surname> <given-names>J. D.</given-names></name> <name><surname>Tibshirani</surname> <given-names>R.</given-names></name></person-group> (<year>2003</year>). <article-title>Statistical significance for genomewide studies</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>100</volume>, <fpage>9440</fpage>&#x02013;<lpage>9445</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.1530509100</pub-id><pub-id pub-id-type="pmid">12883005</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stranger</surname> <given-names>B. E.</given-names></name> <name><surname>Forrest</surname> <given-names>M. S.</given-names></name> <name><surname>Dunning</surname> <given-names>M.</given-names></name> <name><surname>Ingle</surname> <given-names>C. E.</given-names></name> <name><surname>Beazley</surname> <given-names>C.</given-names></name> <name><surname>Thorne</surname> <given-names>N.</given-names></name> <etal/></person-group> (<year>2007a</year>). <article-title>Relative impact of nucleotide and copy number variation on gene expression phenotypes</article-title>. <source>Science</source> <volume>315</volume>, <fpage>848</fpage>&#x02013;<lpage>853</lpage>.<pub-id pub-id-type="doi">10.1126/science.1136678</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stranger</surname> <given-names>B. E.</given-names></name> <name><surname>Nica</surname> <given-names>A. C.</given-names></name> <name><surname>Forrest</surname> <given-names>M. S.</given-names></name> <name><surname>Dimas</surname> <given-names>A.</given-names></name> <name><surname>Bird</surname> <given-names>C. P.</given-names></name> <name><surname>Beazley</surname> <given-names>C.</given-names></name> <etal/></person-group> (<year>2007b</year>). <article-title>Population genomics of human gene expression</article-title>. <source>Nat. Genet.</source> <volume>39</volume>, <fpage>1217</fpage>&#x02013;<lpage>1224</lpage>.<pub-id pub-id-type="doi">10.1038/ng2142</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stranger</surname> <given-names>B. E.</given-names></name> <name><surname>Montgomery</surname> <given-names>S. B.</given-names></name> <name><surname>Dimas</surname> <given-names>A. S.</given-names></name> <name><surname>Parts</surname> <given-names>L.</given-names></name> <name><surname>Stegle</surname> <given-names>O.</given-names></name> <name><surname>Ingle</surname> <given-names>C. E.</given-names></name> <etal/></person-group> (<year>2012</year>). <article-title>Patterns of cis regulatory variation in diverse human populations</article-title>. <source>PLoS Genet.</source> <volume>8</volume>:<fpage>e1002639</fpage>.<pub-id pub-id-type="doi">10.1371/journal.pgen.1002639</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Gerstein</surname> <given-names>M.</given-names></name> <name><surname>Snyder</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>RNA-Seq: a revolutionary tool for transcriptomics</article-title>. <source>Nat. Rev. Genet.</source> <volume>10</volume>, <fpage>57</fpage>&#x02013;<lpage>63</lpage>.<pub-id pub-id-type="doi">10.1038/nrg2484</pub-id><pub-id pub-id-type="pmid">19015660</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Welsh</surname> <given-names>M.</given-names></name> <name><surname>Mangravite</surname> <given-names>L.</given-names></name> <name><surname>Medina</surname> <given-names>M. W.</given-names></name> <name><surname>Tantisira</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Huang</surname> <given-names>R. S.</given-names></name> <etal/></person-group> (<year>2009</year>). <article-title>Pharmacogenomic discovery using cell-based models</article-title>. <source>Pharmacol. Rev.</source> <volume>61</volume>, <fpage>413</fpage>&#x02013;<lpage>429</lpage>.<pub-id pub-id-type="doi">10.1124/pr.109.001461</pub-id><pub-id pub-id-type="pmid">20038569</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhong</surname> <given-names>H.</given-names></name> <name><surname>Beaulaurier</surname> <given-names>J.</given-names></name> <name><surname>Lum</surname> <given-names>P. Y.</given-names></name> <name><surname>Molony</surname> <given-names>C.</given-names></name> <name><surname>Yang</surname> <given-names>X.</given-names></name> <name><surname>Macneil</surname> <given-names>D. J.</given-names></name> <etal/></person-group> (<year>2010</year>). <article-title>Liver and adipose expression associated SNPs are enriched for association to type 2 diabetes</article-title>. <source>PLoS Genet.</source> <volume>6</volume>:<fpage>e1000932</fpage>.<pub-id pub-id-type="doi">10.1371/journal.pgen.1000932</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>V. W.</given-names></name> <name><surname>Goren</surname> <given-names>A.</given-names></name> <name><surname>Bernstein</surname> <given-names>B. E.</given-names></name></person-group> (<year>2011</year>). <article-title>Charting histone modifications and the functional organization of mammalian genomes</article-title>. <source>Nat. Rev. Genet.</source> <volume>12</volume>, <fpage>7</fpage>&#x02013;<lpage>18</lpage>.<pub-id pub-id-type="doi">10.1038/nrg2905</pub-id><pub-id pub-id-type="pmid">21116306</pub-id></citation></ref>
</ref-list>
</back>
</article>
