<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2020.00053</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>epiCOLOC: Integrating Large-Scale and Context-Dependent Epigenomics Features for Comprehensive Colocalization Analysis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Zhou</surname>
<given-names>Yao</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/809628"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sun</surname>
<given-names>Yongzheng</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Huang</surname>
<given-names>Dandan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Li</surname>
<given-names>Mulin Jun</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/733107"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University</institution>, <addr-line>Tianjin</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, Tianjin Medical University</institution>, <addr-line>Tianjin</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Geir Kjetil Sandve, University of Oslo, Oslo, Norway</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Mikhail Dozmorov, Virginia Commonwealth University, Richmond, United States; Enrique Medina-Acosta, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Brazil</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Mulin Jun Li, <email xlink:href="mailto:mulinli@connect.hku.hk">mulinli@connect.hku.hk</email>
</p>
</fn>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Genetics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>12</day>
<month>02</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>11</volume>
<elocation-id>53</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>09</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>01</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2020 Zhou, Sun, Huang and Li</copyright-statement>
<copyright-year>2020</copyright-year>
<copyright-holder>Zhou, Sun, Huang and Li</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>High-throughput genome-wide epigenomic assays, such as ChIP-seq, DNase-seq and ATAC-seq, have profiled a huge number of functional elements across numerous human tissues/cell types, providing an unprecedented opportunity to interpret the human genome and disease in a context-dependent manner. Colocalization analysis determines whether genomic features are functionally related to a given query set of genomic regions, and helps identify the biological functions underlying these relationships. Existing colocalization methods leverage diverse assumptions and background models to assess the significance of enrichment; however, they only provide limited and predefined sets of epigenomic features. Here, we comprehensively collected and integrated 44,385 bulk or single-cell epigenomic assays across 53 human tissues/cell types, covering transcription factor binding, histone modification, open chromatin and transcriptional events. By classifying these profiles into a hierarchy of tissues/cell types, we developed a web portal, epiCOLOC (<uri xlink:href="http://mulinlab.org/epicoloc">http://mulinlab.org/epicoloc</uri> or <uri xlink:href="http://mulinlab.tmu.edu.cn/epicoloc">http://mulinlab.tmu.edu.cn/epicoloc</uri>), for users to perform context-dependent colocalization analysis in a convenient way.</p>
</abstract>
<kwd-group>
<kwd>colocalization</kwd>
<kwd>epigenomics and epigenetics</kwd>
<kwd>functional annotation analysis</kwd>
<kwd>genetic variants</kwd>
<kwd>cell type specific</kwd>
<kwd>web server</kwd>
</kwd-group>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
<counts>
<fig-count count="2"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="43"/>
<page-count count="8"/>
<word-count count="3323"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<title>Introduction</title>
<p>The epigenome, beyond the genome sequence, has been increasingly recognized as a key component of gene regulation that drives certain biological processes and is associated with many human diseases (<xref ref-type="bibr" rid="B22">Lawrence et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B9">Dor and Cedar, 2018</xref>; <xref ref-type="bibr" rid="B14">Feinberg, 2018</xref>). In the past decades, high-throughput epigenomic sequencing assays have profiled large numbers of functional elements across numerous human tissues/cell types, including histone modifications, DNA methylation, open chromatin and transcription factor binding sites (TFBSs). The International Human Epigenome Consortium (IHEC) project (<xref ref-type="bibr" rid="B3">Bujold et&#xa0;al., 2016</xref>) has been initiated, across different countries and consortia, to coordinate the production of reference maps of human epigenomes for key cellular states relevant to health and disease. This unprecedented growth of epigenetic profiles, followed by comprehensive analysis of tissue/cell type-specific epigenomes, will ultimately lead to a better understanding of how human population and genome function are shaped in response to the environment (<xref ref-type="bibr" rid="B12">Egtex, 2017</xref>).</p>
<p>To facilitate convenient and accurate use of the increasing volume of epigenomic data, several commonly used resources have uniformly processed raw profiles and made them easily accessible, including ENCODE (<xref ref-type="bibr" rid="B7">Consortium, 2012</xref>), Roadmap Epigenomics (<xref ref-type="bibr" rid="B30">Roadmap Epigenomics et&#xa0;al., 2015</xref>), Blueprint Epigenome (<xref ref-type="bibr" rid="B34">Stunnenberg et&#xa0;al., 2016</xref>) and CistromeDB (<xref ref-type="bibr" rid="B26">Mei et&#xa0;al., 2017</xref>; <xref ref-type="bibr" rid="B42">Zheng et&#xa0;al., 2019</xref>). Furthermore, the accumulation of comprehensive epigenomic data has motivated novel computational methods for modelling functional elements across many tissues/cell types, such as ChromHMM (<xref ref-type="bibr" rid="B30">Roadmap Epigenomics et&#xa0;al., 2015</xref>) and Segway (<xref ref-type="bibr" rid="B24">Libbrecht et&#xa0;al., 2019</xref>). Therefore, integrating such large-scale and context-dependent epigenomic features for novel biological findings is in urgent demand (<xref ref-type="bibr" rid="B11">Dozmorov, 2017</xref>; <xref ref-type="bibr" rid="B4">Cazaly et&#xa0;al., 2019</xref>). To this end, colocalization analysis is frequently used to study the interplay of various functional elements in different biological processes and conditions, where potential enrichment of a given genomic/epigenomic profile in a pre-defined dataset can be assessed from a global perspective (<xref ref-type="bibr" rid="B21">Kanduri et&#xa0;al., 2019</xref>). 
Integrated with large-scale tissue/cell type-specific epigenomic data, colocalization analysis provides a powerful avenue to investigate biological relations and cell type specificities, such as identifying co-occurrence of transcription regulators (<xref ref-type="bibr" rid="B39">Yan et&#xa0;al., 2013</xref>) and inferring causal tissues/cell types from disease-associated variants identified by genome-wide association studies (GWAS) (<xref ref-type="bibr" rid="B13">Farh et&#xa0;al., 2015</xref>).</p>
<p>Many colocalization tools have been developed, relying on diverse assumptions and background models to assess the significance of enrichment. For instance, GSuite HyperBrowser is a web-based tool that performs colocalization analysis using either analytical approaches or Monte Carlo simulations (<xref ref-type="bibr" rid="B32">Simovski et&#xa0;al., 2017</xref>). LOLA utilizes Fisher's exact test based on universe regions to inspect enrichment and provides a web-based portal, LOLAweb (<xref ref-type="bibr" rid="B31">Sheffield and Bock, 2016</xref>; <xref ref-type="bibr" rid="B27">Nagraj et&#xa0;al., 2018</xref>). GoShifter (<xref ref-type="bibr" rid="B35">Trynka et&#xa0;al., 2015</xref>) and GARFIELD (<xref ref-type="bibr" rid="B20">Iotchkova et&#xa0;al., 2019</xref>), which were implemented as standalone tools, specifically quantify enrichment of overlaps between GWAS variants and genomic annotations by considering linkage disequilibrium (LD). To overcome the discordant enrichment among existing methods, Coloc-stats integrates multiple colocalization analysis tools in a single web interface (<xref ref-type="bibr" rid="B33">Simovski et&#xa0;al., 2018</xref>). This integrated system serves as a one-stop shop for performing comprehensive colocalization analysis and assesses the consistency of the conclusions across seven different methods. However, some critical issues remain unaddressed. First, existing tools only provide limited pre-defined sets of genomic features in different biological domains. Current web-based tools, such as GSuite HyperBrowser, GenomeRunner (<xref ref-type="bibr" rid="B10">Dozmorov et&#xa0;al., 2016</xref>) and LOLAweb, only incorporate a small number of epigenomic profiles from ENCODE, Cistrome and other specific annotation datasets, which restricts the broader application of online colocalization analysis. 
Second, the descriptions of tissue and cell type information are disordered and based only on free text, making current tools unable to properly classify or group tissues/cell types to inspect the specificity of enrichment. Therefore, a uniform human tissue/cell-type definition is needed. Furthermore, given the growing volume of epigenomic profiles across extensive tissues/cell types, collecting and integrating these genomic features requires great effort. Most colocalization web tools are also slow at feature intersection and background generation when dealing with data at this scale. To ease comprehensive colocalization analysis for biologists and geneticists, a faster and more versatile online platform would be welcome.</p>
<p>In this study, we comprehensively collected and integrated 44,385 bulk or single-cell epigenomic profiles across 53 human tissues/cell types. By classifying and mapping these profiles into a hierarchy of tissues/cell types, we developed a web portal, epiCOLOC, for users to perform context-dependent colocalization analysis in a convenient way. We leveraged a recent ultrafast genomics search engine, GIGGLE, to identify and prioritize the enrichment of genomic loci shared between query features and our pre-defined epigenomic interval files (<xref ref-type="bibr" rid="B23">Layer et&#xa0;al., 2018</xref>). epiCOLOC is equipped with many visualization functions and is freely available at <uri xlink:href="http://mulinlab.org/epicoloc">http://mulinlab.org/epicoloc</uri> or <uri xlink:href="http://mulinlab.tmu.edu.cn/epicoloc">http://mulinlab.tmu.edu.cn/epicoloc</uri>.</p>
</sec>
<sec id="s2">
<title>Epigenomic Profiles Integration and Processing</title>
<sec id="s2_1">
<title>Data Collection</title>
<p>We collected human genomic and epigenomic data from various public resources including ENCODE (<xref ref-type="bibr" rid="B7">Consortium, 2012</xref>), Roadmap Epigenomics (<xref ref-type="bibr" rid="B30">Roadmap Epigenomics et&#xa0;al., 2015</xref>), Cistrome (<xref ref-type="bibr" rid="B26">Mei et&#xa0;al., 2017</xref>), ReMap (<xref ref-type="bibr" rid="B5">Cheneby et&#xa0;al., 2018</xref>), ChIP-Atlas (<xref ref-type="bibr" rid="B28">Oki et&#xa0;al., 2018</xref>), DeepBlue (<xref ref-type="bibr" rid="B1">Albrecht et&#xa0;al., 2017</xref>), BOCA (<xref ref-type="bibr" rid="B17">Fullard et&#xa0;al., 2018</xref>), TCGA (<xref ref-type="bibr" rid="B8">Corces et&#xa0;al., 2018</xref>) and HACER (<xref ref-type="bibr" rid="B38">Wang et&#xa0;al., 2019</xref>) (<xref ref-type="supplementary-material" rid="SM2">
<bold>Supplementary Table 3</bold>
</xref>). According to data sources and corresponding attributes, we classified the collected features into the following categories: 1) Transcriptional regulator, which incorporates ChIP-seq profiles of a large number of transcription factors and chromatin remodelers; 2) Histone modification, which incorporates ChIP-seq profiles of different histone modifications; 3) Chromatin accessibility, which contains DNase-seq, ATAC-seq and FAIRE-seq profiles of open chromatin, together with several curated single-cell ATAC-seq assays; 4) Transcriptional event, which contains CAGE-seq, GRO-seq and PRO-seq profiles of nascent transcription signals; 5) Chromatin segmentation, which introduces tissue/cell type-specific chromatin states predicted by ChromHMM and Segway (<xref ref-type="fig" rid="f1">
<bold>Figure 1A</bold>
</xref> and <xref ref-type="supplementary-material" rid="SM2">
<bold>Supplementary Table 1</bold>
</xref>). To improve the accuracy and robustness of the epiCOLOC backend database, we removed low-quality profiles according to the quality control scheme provided by the original resource. For example, we removed ChIP-seq data that failed two Cistrome quality metrics: the fraction of reads in peaks, and a sufficient number of peaks with good enrichment. We also excluded ENCODE profiles with error audit flags, such as extremely low read length or an untagged antibody. The current epiCOLOC database covers 1,631 chromatin markers, comprising 88 histone modifications, 1,538 transcriptional regulators, open chromatin and transcriptional events.</p>
<fig id="f1" position="float">
<label>Figure 1</label>
<caption>
<p>Overview of the epiCOLOC design and datasets. <bold>(A)</bold> The source schema of epiCOLOC data collection. <bold>(B)</bold> An example illustrating the removal of outlier profiles. <bold>(C)</bold> Summary of data types in the current version of epiCOLOC.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fgene-11-00053-g001.tif"/>
</fig>
</sec>
<sec id="s2_2">
<title>Data Processing</title>
<sec id="s2_2_1">
<title>Tissue Organization and Mapping</title>
<p>We mapped cell lines to tissues by accounting for some auxiliary information from original epigenomic studies and several standards from GTEx (<xref ref-type="bibr" rid="B6">Consortium et&#xa0;al., 2017</xref>), Expression Atlas (<xref ref-type="bibr" rid="B29">Papatheodorou et&#xa0;al., 2018</xref>), Cellosaurus (<xref ref-type="bibr" rid="B2">Bairoch, 2018</xref>), ATCC (<uri xlink:href="https://www.atcc.org">www.atcc.org</uri>), and BRENDA Tissue Ontologies (<uri xlink:href="https://www.ebi.ac.uk/ols/ontologies/bto">www.ebi.ac.uk/ols/ontologies/bto</uri>), yielding 53 main human tissues in total. For some main tissues that contain multiple well characterized components or some cell lines that cannot simply map to specific main tissues, we set independent terms in tissue set and finally generated 137 sub-tissues (<xref ref-type="supplementary-material" rid="SM2">
<bold>Supplementary Table 2</bold>
</xref>). We then manually mapped the tissue/cell type name of each profile to our uniformly defined tissue set.</p>
</sec>
<sec id="s2_2_2">
<title>Cell Type Mapping</title>
<p>To reduce the complexity of cell type descriptions in our collected epigenomic profiles, we performed cell type mapping using Cellosaurus, a reference database that collects almost all cell line synonyms (<xref ref-type="bibr" rid="B2">Bairoch, 2018</xref>). We acquired the Cellosaurus accession numbers and corresponding synonyms for all recorded cell lines, and assigned uniform synonym identifiers to the epigenomic profiles, which greatly reduces the heterogeneity of cell type descriptions. For cancer cell type mapping, we used DepMap, which provides standard terms for thousands of cancer cell lines and organoid models (<xref ref-type="bibr" rid="B37">Van Der Meer et&#xa0;al., 2019</xref>). Since DepMap provides Cellosaurus accession numbers, we were able to easily map cancer cell lines to a consistent reference.</p>
</sec>
<sec id="s2_2_3">
<title>Profile Grouping</title>
<p>Since the epigenomic data were generated by different laboratories using different protocols, replicates and analysis methods across the collected sources, we sought to identify profiles describing similar biological processes in each source. We grouped all collected profiles by the combination of source, assay type, tissue/cell type and biological target, and assigned a unique group identifier to each group.</p>
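<p>The grouping step can be illustrated with a minimal Python sketch. The record fields (<monospace>source</monospace>, <monospace>assay</monospace>, <monospace>cell</monospace>, <monospace>target</monospace>), the example records and the <monospace>G0001</monospace>-style identifiers are hypothetical and chosen for illustration only; they do not reflect the actual epiCOLOC database schema.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# Hypothetical profile records; the field names are illustrative assumptions,
# not the epiCOLOC database schema.
profiles = [
    {"id": "p1", "source": "ENCODE", "assay": "ChIP-seq", "cell": "GM12878", "target": "ETS1"},
    {"id": "p2", "source": "ENCODE", "assay": "ChIP-seq", "cell": "GM12878", "target": "ETS1"},
    {"id": "p3", "source": "Cistrome", "assay": "ChIP-seq", "cell": "GM12878", "target": "ETS1"},
    {"id": "p4", "source": "ENCODE", "assay": "DNase-seq", "cell": "HepG2", "target": None},
]


def group_profiles(records):
    """Group profile IDs by (source, assay, cell, target) and label each group."""
    members = defaultdict(list)
    for rec in records:
        key = (rec["source"], rec["assay"], rec["cell"], rec["target"])
        members[key].append(rec["id"])
    # Assign a sequential, hypothetical group identifier in first-seen order.
    return {
        f"G{i:04d}": (key, ids)
        for i, (key, ids) in enumerate(members.items(), start=1)
    }


groups = group_profiles(profiles)
for gid, (key, ids) in groups.items():
    print(gid, key, ids)
```
]]></preformat>
<p>Keying on the full tuple keeps profiles from different sources in separate groups even when they share an assay, cell type and target, which matches the per-source grouping described above.</p>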
</sec>
<sec id="s2_2_4">
<title>Outlier Profiles Removal</title>
<p>To further ensure informative profiles in each group, we designed a strategy to eliminate potential outlier profiles that may deviate from underlying biological process of the group (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Methods</bold>
</xref>). For each group with at least three profiles, we first constructed a pair-wise similarity matrix for all profiles based on the GIGGLE combo score (<xref ref-type="bibr" rid="B23">Layer et&#xa0;al., 2018</xref>). Then, hierarchical clustering was used to cluster these profiles based on Euclidean distance, and the optimal number of clusters was automatically determined by the inconsistency coefficient method (<xref ref-type="bibr" rid="B40">Zahn, 1971</xref>). Finally, we retained only the profiles within the largest cluster as representatives of the group. For example, we identified four outlier profiles among 11 ETS1 ChIP-seq peak profiles in the GM12878 cell line, and excluded them from the colocalization analysis (<xref ref-type="fig" rid="f1">
<bold>Figure 1B</bold>
</xref>).</p>
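<p>The idea of keeping only the largest cluster can be sketched in Python as follows. This is a deliberately simplified stand-in and not the procedure used by epiCOLOC: single-linkage clustering with a fixed, hypothetical Euclidean distance cutoff replaces the inconsistency coefficient method, and the similarity values are made up for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def largest_cluster(similarity, cutoff):
    """Return the indices of the largest cluster of profiles.

    similarity: square matrix of pairwise scores (one row per profile).
    cutoff: hypothetical Euclidean distance below which two profiles are linked;
            a fixed cutoff is a simplification of the inconsistency method.
    """
    n = len(similarity)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of profiles whose similarity rows are close (single linkage).
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(similarity[i], similarity[j]) < cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return max(clusters.values(), key=len)


# Made-up similarity matrix: profiles 0-2 resemble each other, profile 3 does not.
similarity = [
    [100, 90, 85, 5],
    [90, 100, 88, 4],
    [85, 88, 100, 6],
    [5, 4, 6, 100],
]
kept = largest_cluster(similarity, cutoff=30.0)
print(kept)
```
]]></preformat>
<p>Profiles outside the returned index list would be treated as outliers and excluded from the group.</p>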
</sec>
</sec>
<sec id="s2_3">
<title>epiCOLOC Web Tool Implementation</title>
<p>The current version of epiCOLOC incorporates 44,385 tissue/cell type-specific functional profiles from 44,364 bulk-cell studies and 21 single-cell studies after quality control (<xref ref-type="supplementary-material" rid="SM2">
<bold>Supplementary Table 4</bold>
</xref>). Most of these profiles (89.8%) are derived from ChIP-seq for transcription regulators and histone modifications, while 9.5% of profiles are derived from DNase-seq and ATAC-seq for chromatin accessibility (<xref ref-type="fig" rid="f1">
<bold>Figure 1C</bold>
</xref>).</p>
</sec>
<sec id="s2_4">
<title>Colocalization Method</title>
<p>To achieve fast and efficient colocalization on a high volume of epigenomic features, we embedded a genomic feature search engine, GIGGLE, into the epiCOLOC web server (<xref ref-type="bibr" rid="B23">Layer et&#xa0;al., 2018</xref>). GIGGLE uses Fisher's exact test and the odds ratio of &#x201c;observed&#x201d; versus &#x201c;expected&#x201d; overlaps to measure enrichment between query features and pre-indexed genomic intervals. It also creates a combination score called the GIGGLE combo score, which is the product of -log10(Fisher's exact test <italic>P</italic>-value) and log2(odds ratio). Given the thousands of epigenomic profiles in the epiCOLOC database, GIGGLE can significantly reduce the running time from hours to minutes. For example, epiCOLOC takes about 6 minutes to finish colocalization analysis on transcriptional regulator profiles of all blood cells for a set of 10k intervals (randomly generated genomic intervals of varying length). For each profile group, we calculated the median score to represent group-level enrichment. With the aid of this efficient colocalization strategy, epiCOLOC aims to provide context-specific epigenomic evidence for addressing biological questions such as &#x201c;Are two transcription factors (TFs) colocalized, suggesting cooperation?&#x201d;, &#x201c;Are the query variants/intervals enriched in open chromatin regions of specific tissues?&#x201d; or &#x201c;Do the query variants/intervals overlap with transcribed enhancer regions more than would be expected by chance?&#x201d; More biological examples can be found on our website <uri xlink:href="http://mulinlab.org/epicoloc/Introduction/#Biological-examples">http://mulinlab.org/epicoloc/Introduction/#Biological-examples</uri>.</p>
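<p>As a worked illustration of the scoring described above, the following Python sketch computes the combo score as the product of -log10(<italic>P</italic>-value) and log2(odds ratio), and then takes the median across a profile group. The function name and the example values are hypothetical; GIGGLE's own handling of edge cases, such as a <italic>P</italic>-value or odds ratio of zero, is not reproduced here.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
from statistics import median


def combo_score(p_value, odds_ratio):
    """Product of -log10(P-value) and log2(odds ratio), as described in the text.

    Both inputs must be strictly positive; zero values are not handled in this sketch.
    """
    return -math.log10(p_value) * math.log2(odds_ratio)


# Hypothetical (P-value, odds ratio) pairs for three profiles in one group.
group = [(1e-10, 4.0), (1e-6, 2.0), (0.01, 0.5)]
scores = [combo_score(p, o) for p, o in group]

# Group-level enrichment is summarized by the median profile score.
group_score = median(scores)
print(scores, group_score)
```
]]></preformat>
<p>An odds ratio below 1 gives a negative log2 term, so depleted profiles receive negative combo scores, while the <italic>P</italic>-value term scales the magnitude.</p>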
</sec>
<sec id="s2_5">
<title>Web Interface and Usage</title>
<p>epiCOLOC was implemented as a web-based tool with built-in large-scale and context-dependent epigenomic annotations. The epigenomic profiles were indexed using GIGGLE. The web server was developed using Python, jQuery, igv.js, amcharts.js and related JavaScript modules.</p>
<sec id="s2_5_1">
<title>Queries</title>
<p>epiCOLOC accepts two types of genomic format: BED-like and VCF-like. Regions of interest (ROIs) or variant positions can be supplied either as pasted plain text or as an uploaded file. An uploaded file can be a BED or VCF text file, or a gzip-compressed file (&lt;20 MB).</p>
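<p>To make the two input formats concrete, the sketch below converts BED-like and VCF-like lines into a common 0-based, half-open interval representation. It follows the usual conventions of the two formats (BED is 0-based with an exclusive end, VCF positions are 1-based); how epiCOLOC itself converts variants internally is not stated here, so the choice to span the REF allele is an assumption, and the function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def parse_query_line(line, fmt):
    """Convert one BED-like or VCF-like line to (chrom, start, end), 0-based half-open."""
    fields = line.rstrip("\n").split("\t")
    if fmt == "bed":
        # BED: chrom, 0-based start, exclusive end.
        return fields[0], int(fields[1]), int(fields[2])
    if fmt == "vcf":
        # VCF: chrom, 1-based POS, ID, REF, ...; the interval spans the REF allele
        # (an assumption made for this sketch).
        pos, ref = int(fields[1]), fields[3]
        return fields[0], pos - 1, pos - 1 + len(ref)
    raise ValueError(f"unsupported format: {fmt}")


def parse_query(text, fmt):
    """Parse multi-line input, skipping blank lines and '#' header lines."""
    return [
        parse_query_line(line, fmt)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


bed_text = "chr1\t100\t200\nchr2\t5000\t5400\n"
vcf_text = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t101\trs1\tA\tG\n"
print(parse_query(bed_text, "bed"))
print(parse_query(vcf_text, "vcf"))
```
]]></preformat>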
</sec>
<sec id="s2_5_2">
<title>Options</title>
<p>epiCOLOC provides several options for users to customize colocalization analysis, including 1) select tissues (53 tissues/137 sub-tissues); 2) select profile categories (Transcriptional regulator, Histone modification, Chromatin accessibility, Transcriptional event, Chromatin segmentation); 3) change the human genome assembly (GRCh37 or GRCh38); 4) define the background genome size (3,095,677,412 for GRCh37 and 3,088,269,832 for GRCh38 by default); 5) set the maximal interval length (500 bp by default; ROIs exceeding the maximal length will be removed); 6) set the extended length on both sides (no extension by default); 7) set the central window size (keep only the central area of each genomic interval; no central window by default).</p>
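<p>Options 5 to 7 amount to a small interval preprocessing step, sketched below in Python. The function and parameter names are hypothetical, and the order of operations (length filter, then central window, then extension) as well as the exact central-window semantics are assumptions made for illustration rather than a description of the server's implementation.</p>
<preformat preformat-type="code"><![CDATA[
```python
def preprocess(intervals, max_len=500, extend=0, window=None):
    """Filter and reshape (chrom, start, end) intervals, 0-based half-open.

    max_len: drop intervals longer than this (500 bp mirrors the stated default).
    extend:  bases added on both sides (0 means no extension).
    window:  if set, keep only a central window of this size (assumed semantics).
    """
    out = []
    for chrom, start, end in intervals:
        if end - start > max_len:
            continue  # ROI exceeds the maximal interval length
        if window is not None:
            mid = (start + end) // 2
            start = mid - window // 2
            end = start + window
        # Extend both sides; the start is clipped at 0, the end is not clipped
        # to the chromosome length in this sketch.
        start, end = max(0, start - extend), end + extend
        out.append((chrom, start, end))
    return out


rois = [("chr1", 100, 300), ("chr1", 0, 1000)]
print(preprocess(rois))
print(preprocess(rois, extend=50))
print(preprocess(rois, window=100))
```
]]></preformat>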
</sec>
<sec id="s2_5_3">
<title>Job Submission</title>
<p>Once submitted, the job will be sent to the backend of the web server for colocalization analysis. epiCOLOC displays a progress bar to track the execution status. It allows job retrieval by searching for the job ID in the home page, or by using a fixed URL (<uri xlink:href="http://mulinlab.org/epicoloc/&lt;jobid&gt;">http://mulinlab.org/epicoloc/&lt;jobid&gt;</uri>) to check results directly, or through email notification.</p>
</sec>
<sec id="s2_5_4">
<title>Results Visualization</title>
<p>We used GIGGLE combo scores to prioritize colocalization results. A higher combo score indicates stronger enrichment for a specific profile, while a negative combo score suggests depletion (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure 1</bold>
</xref>). Users can inspect and visualize the results in four different ways: 1) Prioritization table, which shows statistical metrics of colocalization including combo score, Fisher's exact <italic>P</italic>-value, odds ratio, the number of overlaps and extra information on enriched profiles (<xref ref-type="fig" rid="f2">
<bold>Figure 2A</bold>
</xref>); 2) Tissue-wise pie charts for enrichment and depletion, which depict the per tissue proportion in all enriched (positive combo score) or depleted (negative combo score) profiles (<xref ref-type="fig" rid="f2">
<bold>Figure 2B</bold>
</xref>). Users can click the slice of each tissue in the pie chart to see detailed sub-tissue results; 3) Tissue-wise bar plots, which display the representative enriched or depleted profiles in each tissue (<xref ref-type="fig" rid="f2">
<bold>Figure 2C</bold>
</xref>). The user can search, scroll, zoom and hover over the bar plot to get detailed information on enrichment (only assay IDs for the best profiles in each group are displayed in the hover tooltip). Once the label under a tissue-wise bar plot is clicked, cell type-wise bars depicting enrichment patterns for the top 20 enriched cell types appear in a pop-up window; 4) IGV dashboard, which displays the relative genomic locations of the query intervals and the top five enriched profiles in the colocalization analysis.</p>
<fig id="f2" position="float">
<label>Figure 2</label>
<caption>
<p>Results page of epiCOLOC. Colocalization result for IBD GWAS variants in open chromatin regions. <bold>(A)</bold> Prioritization table. <bold>(B)</bold> Pie chart that depicts the number of significantly enriched or depleted profiles in each tissue. <bold>(C)</bold> Bar plots that display ordered combo score, <italic>P</italic>-value and odds ratio in a tissue-wise manner.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fgene-11-00053-g002.tif"/>
</fig>
</sec>
<sec id="s2_5_5">
<title>Download</title>
<p>epiCOLOC allows users to download colocalization results in csv format and result figures in png, jpg or pdf formats.</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>Case Studies and Evaluations</title>
<p>By integrating large-scale tissue/cell type-specific epigenomic profiles, epiCOLOC can be used to investigate many biological questions. Here, we use several examples to demonstrate the performance and potential uses of epiCOLOC.</p>
<p>To identify potential disease-relevant genomic features and tissues using GWAS variants, we first performed colocalization analysis on disease-associated variants for inflammatory bowel disease (IBD) (<xref ref-type="bibr" rid="B25">Liu et&#xa0;al., 2015</xref>) to test for tissue-specific enrichment. Using chromatin accessibility features, we found that IBD GWAS variants (<italic>P</italic>-value &lt; 5E-8) were significantly enriched in blood tissue, where open chromatin profiles of monocytes, lymphocytes and granulocyte-macrophage progenitors received the highest enrichment scores (<xref ref-type="fig" rid="f2">
<bold>Figure 2</bold>
</xref>, and also see colocalization result from: <uri xlink:href="http://mulinlab.org/epicoloc/results/bc2fa49a-6dfa-40f1-bb61-1349c9118168">http://mulinlab.org/epicoloc/results/bc2fa49a-6dfa-40f1-bb61-1349c9118168</uri>). This result was consistent with GARFIELD results using functional annotations from ENCODE and Roadmap Epigenomics (<xref ref-type="bibr" rid="B20">Iotchkova et&#xa0;al., 2019</xref>). We then used coronary artery disease (CAD) GWAS variants (<italic>P</italic>-value &lt; 5E-8) to perform colocalization in open chromatin regions (<xref ref-type="bibr" rid="B36">Van Der Harst and Verweij, 2018</xref>). Consistent with GARFIELD reports, we observed that most tissues showed similar enrichment patterns, without distinct tissue specificity at open chromatin (<uri xlink:href="http://mulinlab.org/epicoloc/results/63b0cd1b-f22f-43dd-9452-fdea114f6c3d">http://mulinlab.org/epicoloc/results/63b0cd1b-f22f-43dd-9452-fdea114f6c3d</uri>). However, when using fine-mapped CAD variants, we observed several highly enriched signals in tissues such as liver and artery blood vessel (<uri xlink:href="http://mulinlab.org/epicoloc/results/04bf79a8-f7cd-4960-913e-5c5c84c05753">http://mulinlab.org/epicoloc/results/04bf79a8-f7cd-4960-913e-5c5c84c05753</uri>), highlighting the importance of selecting informative ROIs before colocalization analysis.</p>
<p>Next, we sought to demonstrate whether epiCOLOC could be used to identify potential cooperative factors for a given TF. Transcription factor 7-like 2 (TCF7L2), a TF in the Wnt-signaling pathway, has been shown to play a central role in coordinating the expression of proinsulin and the formation of mature insulin (<xref ref-type="bibr" rid="B43">Zhou et&#xa0;al., 2014</xref>). TCF7L2 binding sites have been reported to colocalize with HNF4alpha and FOXA2 in HepG2 cells (<xref ref-type="bibr" rid="B16">Frietze et&#xa0;al., 2012</xref>). We hence used TCF7L2 ChIP-seq peaks in HepG2 as the query for colocalization analysis with epiCOLOC. In our colocalization results, TCF7L2 ChIP-seq peaks were significantly enriched in EP300, CREM, SP1, FOXA2 and HNF4alpha ChIP-seq profiles in various tissues/cell types (<uri xlink:href="http://mulinlab.org/epicoloc/results/d736578a-59a4-4160-a6fe-1a9c420c4adf">http://mulinlab.org/epicoloc/results/d736578a-59a4-4160-a6fe-1a9c420c4adf</uri>). Furthermore, we used two motif finding tools, PscanChIP (<xref ref-type="bibr" rid="B41">Zambelli et&#xa0;al., 2013</xref>) and HOMER (<xref ref-type="bibr" rid="B19">Heinz et&#xa0;al., 2010</xref>), with the same query input to investigate enriched TF motifs. We found that TF motifs including HNF4alpha, FOXA2, TCF7, GATA4, FOXP1, FOXA1, FOXK2 and FOXO3 were identified by both motif finding tools as well as by epiCOLOC, which further supports the efficacy of our tool.</p>
</sec>
<sec id="s4" sec-type="discussion">
<title>Discussion</title>
<p>In this study, we built a comprehensive database of tissue/cell type-specific epigenomic profiles. With strict pre-processing, quality control and tissue mapping, we established a user-friendly web portal, epiCOLOC, which performs fast and context-dependent colocalization analysis, provides a series of visualization functions to interpret results, and differs substantially from existing web-based tools (<xref ref-type="supplementary-material" rid="SM2">
<bold>Supplementary Table 5</bold>
</xref>). In the applied examples, we demonstrated the accuracy and practicality of epiCOLOC in identifying causal tissues/cell types from GWAS disease-associated variants and inferring co-occurrence of transcription regulators.</p>
<p>This work has some limitations that we plan to address in the future. First, the statistical assumption of GIGGLE is simple and could be sub-optimal in several cases. We strongly recommend that users prioritize results by combo score and set stringent thresholds. Based on the distribution of combo scores at <italic>P</italic> &#x2264; 0.05 for query intervals randomly generated across the genome (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure 2</bold>
</xref>), we propose to use an empirical combo score cutoff, 5 for enrichment and -2 for depletion, as advisable criteria to further filter enrichment or depletion results. Although GIGGLE can greatly speed up colocalization analysis, as compared with GenomeRunner (<xref ref-type="bibr" rid="B10">Dozmorov et&#xa0;al., 2016</xref>) and LOLAweb (<xref ref-type="bibr" rid="B27">Nagraj et&#xa0;al., 2018</xref>), it limits the usage of user-specific background of genomic regions and the analysis of multiple genomic intervals. Second, although epiCOLOC is applicable to perform colocalization analysis using genetic variants, but it cannot account for LD and allele frequency. Third, there are uneven epigenomic profiles for different tissues/cell types. It may potentially affect the robustness of colocalization when applying epiCOLOC to the tissues/cell types having fewer data available, and it also cannot determine the missing enrichment for tissues/cell types lacking sufficient data. In addition, single-cell technologies, such as single-cell ATAC-seq and single-cell ChIP-seq (<xref ref-type="bibr" rid="B18">Grosselin et&#xa0;al., 2019</xref>), have been developed to analyze genome-wide epigenomic features. Such approaches pave the way to study the role of epigenetic heterogeneity in many biological conditions and will be largely incorporated into epiCOLOC in the next stage. Recently, a novel algorithm named Augmented Interval List (AIList) (<xref ref-type="bibr" rid="B15">Feng et&#xa0;al., 2019</xref>), which introduces a new data structure and provides a significantly improved fundamental operation for highly scalable genomic data analysis. This method together with upcoming large-scale genomic features will be added in the epiCOLOC future updates.</p>
</sec>
<sec id="s5">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. These data can be obtained from ENCODE, Roadmap Epigenomics and other resources; all related sources are listed here: <uri xlink:href="http://mulinlab.org/epicoloc/Introduction/">http://mulinlab.org/epicoloc/Introduction/</uri>.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>ML designed and guided the study. YZ, YS and DH developed the tool. YZ and ML wrote the manuscript.</p>
</sec>
<sec id="s7" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported by grants from the National Natural Science Foundation of China (31871327 and 31701143 to ML) and the Natural Science Foundation of Tianjin (18JCZDJC34700 and 19JCJQJC63600 to ML). We also thank all tool and resource providers.</p>
</sec>
<sec id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<sec id="s9" sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2020.00053/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fgene.2020.00053/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="DataSheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
<supplementary-material xlink:href="DataSheet_2.xlsx" id="SM2" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Albrecht</surname> <given-names>F.</given-names>
</name>
<name>
<surname>List</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Bock</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Lengauer</surname> <given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>DeepBlueR: large-scale epigenomic analysis in R</article-title>. <source>Bioinformatics</source> <volume>33</volume>, <fpage>2063</fpage>&#x2013;<lpage>2064</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btx099</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bairoch</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>The cellosaurus, a cell-line knowledge resource</article-title>. <source>J. Biomol. Tech.</source> <volume>29</volume>, <fpage>25</fpage>&#x2013;<lpage>38</lpage>. doi: <pub-id pub-id-type="doi">10.7171/jbt.18-2902-002</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bujold</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Morais</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Gauthier</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Cote</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Caron</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Kwan</surname> <given-names>T.</given-names>
</name>
<etal/>
</person-group>. (<year>2016</year>). <article-title>The international human epigenome consortium data portal</article-title>. <source>Cell Syst.</source> <volume>3</volume>, <fpage>496</fpage>&#x2013;<lpage>499.e2</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cels.2016.10.019</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cazaly</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Saad</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Heckman</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Ollikainen</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Tang</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Making sense of the epigenome using data integration approaches</article-title>. <source>Front. Pharmacol.</source> <volume>10</volume>, <elocation-id>126</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fphar.2019.00126</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheneby</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Gheorghe</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Artufel</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Mathelier</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Ballester</surname> <given-names>B.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-seq experiments</article-title>. <source>Nucleic Acids Res.</source> <volume>46</volume>, <fpage>D267</fpage>&#x2013;<lpage>D275</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkx1092</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<collab>Consortium G. T.</collab>
<collab>Laboratory D. A.</collab>
<collab>Coordinating Center -Analysis Working, G. Statistical Methods Groups-Analysis Working, G.</collab>
<collab>Enhancing G. G.</collab>
<collab>Fund N. I. H. C.</collab>
<etal/>
</person-group> (<year>2017</year>). <article-title>Genetic effects on gene expression across human tissues</article-title>. <source>Nature</source> <volume>550</volume>, <fpage>204</fpage>&#x2013;<lpage>213</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nature24277</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Consortium</surname> <given-names>E. P.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>An integrated encyclopedia of DNA elements in the human genome</article-title>. <source>Nature</source> <volume>489</volume>, <fpage>57</fpage>&#x2013;<lpage>74</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nature11247</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Corces</surname> <given-names>M. R.</given-names>
</name>
<name>
<surname>Granja</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Shams</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Louie</surname> <given-names>B. H.</given-names>
</name>
<name>
<surname>Seoane</surname> <given-names>J. A.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>W.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>The chromatin accessibility landscape of primary human cancers</article-title>. <source>Science</source> <volume>362</volume>, <fpage>eaav1898</fpage>. doi: <pub-id pub-id-type="doi">10.1126/science.aav1898</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dor</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Cedar</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Principles of DNA methylation and their implications for biology and medicine</article-title>. <source>Lancet</source> <volume>392</volume>, <fpage>777</fpage>&#x2013;<lpage>786</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0140-6736(18)31268-6</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dozmorov</surname> <given-names>M. G.</given-names>
</name>
<name>
<surname>Cara</surname> <given-names>L. R.</given-names>
</name>
<name>
<surname>Giles</surname> <given-names>C. B.</given-names>
</name>
<name>
<surname>Wren</surname> <given-names>J. D.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>GenomeRunner web server: regulatory similarity and differences define the functional impact of SNP sets</article-title>. <source>Bioinformatics</source> <volume>32</volume>, <fpage>2256</fpage>&#x2013;<lpage>2263</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btw169</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dozmorov</surname> <given-names>M. G.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Epigenomic annotation-based interpretation of genomic data: from enrichment analysis to machine learning</article-title>. <source>Bioinformatics</source> <volume>33</volume>, <fpage>3323</fpage>&#x2013;<lpage>3330</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btx414</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Egtex</surname> <given-names>G. P.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease</article-title>. <source>Nat. Genet.</source> <volume>49</volume>, <fpage>1664</fpage>&#x2013;<lpage>1670</lpage>. doi: <pub-id pub-id-type="doi">10.1038/ng.3969</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Farh</surname> <given-names>K. K.</given-names>
</name>
<name>
<surname>Marson</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Kleinewietfeld</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Housley</surname> <given-names>W. J.</given-names>
</name>
<name>
<surname>Beik</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Genetic and epigenetic fine mapping of causal autoimmune disease variants</article-title>. <source>Nature</source> <volume>518</volume>, <fpage>337</fpage>&#x2013;<lpage>343</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nature13835</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feinberg</surname> <given-names>A. P.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>The key role of epigenetics in human disease prevention and mitigation</article-title>. <source>N. Engl. J. Med.</source> <volume>378</volume>, <fpage>1323</fpage>&#x2013;<lpage>1334</lpage>. doi: <pub-id pub-id-type="doi">10.1056/NEJMra1402513</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Ratan</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Sheffield</surname> <given-names>N. C.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Augmented interval list: a novel data structure for efficient genomic interval search</article-title>. <source>Bioinformatics</source> <volume>35</volume>, <fpage>4907</fpage>&#x2013;<lpage>4911</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btz407</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frietze</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Yao</surname> <given-names>L. J.</given-names>
</name>
<name>
<surname>Tak</surname> <given-names>Y. G.</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>Z. Q.</given-names>
</name>
<name>
<surname>Gaddis</surname> <given-names>M.</given-names>
</name>
<etal/>
</person-group>. (<year>2012</year>). <article-title>Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3</article-title>. <source>Genome Biol.</source> <volume>13</volume>, <fpage>R52</fpage>. doi: <pub-id pub-id-type="doi">10.1186/gb-2012-13-9-r52</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fullard</surname> <given-names>J. F.</given-names>
</name>
<name>
<surname>Hauberg</surname> <given-names>M. E.</given-names>
</name>
<name>
<surname>Bendl</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Egervari</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Cirnaru</surname> <given-names>M. D.</given-names>
</name>
<name>
<surname>Reach</surname> <given-names>S. M.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>An atlas of chromatin accessibility in the adult human brain</article-title>. <source>Genome Res.</source> <volume>28</volume>, <fpage>1243</fpage>&#x2013;<lpage>1252</lpage>. doi: <pub-id pub-id-type="doi">10.1101/gr.232488.117</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grosselin</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Durand</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Marsolier</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Poitou</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Marangoni</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Nemati</surname> <given-names>F.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>High-throughput single-cell ChIP-seq identifies heterogeneity of chromatin states in breast cancer</article-title>. <source>Nat. Genet.</source> <volume>51</volume>, <fpage>1060</fpage>&#x2013;<lpage>1066</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-019-0424-9</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heinz</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Benner</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Spann</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Bertolino</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Y. C.</given-names>
</name>
<name>
<surname>Laslo</surname> <given-names>P.</given-names>
</name>
<etal/>
</person-group>. (<year>2010</year>). <article-title>Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities</article-title>. <source>Mol. Cell</source> <volume>38</volume>, <fpage>576</fpage>&#x2013;<lpage>589</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.molcel.2010.05.004</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iotchkova</surname> <given-names>V.</given-names>
</name>
<name>
<surname>Ritchie</surname> <given-names>G. R. S.</given-names>
</name>
<name>
<surname>Geihs</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Morganella</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Min</surname> <given-names>J. L.</given-names>
</name>
<name>
<surname>Walter</surname> <given-names>K.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>GARFIELD classifies disease-relevant genomic features through integration of functional annotations with association signals</article-title>. <source>Nat. Genet.</source> <volume>51</volume>, <fpage>343</fpage>&#x2013;<lpage>353</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-018-0322-6</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kanduri</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Bock</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Gundersen</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Hovig</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Sandve</surname> <given-names>G. K.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Colocalization analyses of genomic elements: approaches, recommendations and challenges</article-title>. <source>Bioinformatics</source> <volume>35</volume>, <fpage>1615</fpage>&#x2013;<lpage>1624</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/bty835</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lawrence</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Daujat</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Schneider</surname> <given-names>R.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Lateral thinking: how histone modifications regulate gene expression</article-title>. <source>Trends Genet.</source> <volume>32</volume>, <fpage>42</fpage>&#x2013;<lpage>56</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.tig.2015.10.007</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Layer</surname> <given-names>R. M.</given-names>
</name>
<name>
<surname>Pedersen</surname> <given-names>B. S.</given-names>
</name>
<name>
<surname>Disera</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Marth</surname> <given-names>G. T.</given-names>
</name>
<name>
<surname>Gertz</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Quinlan</surname> <given-names>A. R.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>GIGGLE: a search engine for large-scale integrated genome analysis</article-title>. <source>Nat. Methods</source> <volume>15</volume>, <fpage>123</fpage>&#x2013;<lpage>126</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nmeth.4556</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Libbrecht</surname> <given-names>M. W.</given-names>
</name>
<name>
<surname>Rodriguez</surname> <given-names>O. L.</given-names>
</name>
<name>
<surname>Weng</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Bilmes</surname> <given-names>J. A.</given-names>
</name>
<name>
<surname>Hoffman</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Noble</surname> <given-names>W. S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types</article-title>. <source>Genome Biol.</source> <volume>20</volume>, <fpage>180</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s13059-019-1784-2</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>J. Z.</given-names>
</name>
<name>
<surname>Van Sommeren</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Ng</surname> <given-names>S. C.</given-names>
</name>
<name>
<surname>Alberts</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Takahashi</surname> <given-names>A.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations</article-title>. <source>Nat. Genet.</source> <volume>47</volume>, <fpage>979</fpage>&#x2013;<lpage>986</lpage>. doi: <pub-id pub-id-type="doi">10.1038/ng.3359</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mei</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Qin</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Zang</surname> <given-names>C.</given-names>
</name>
<etal/>
</person-group>. (<year>2017</year>). <article-title>Cistrome data browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse</article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume>, <fpage>D658</fpage>&#x2013;<lpage>D662</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkw983</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nagraj</surname> <given-names>V. P.</given-names>
</name>
<name>
<surname>Magee</surname> <given-names>N. E.</given-names>
</name>
<name>
<surname>Sheffield</surname> <given-names>N. C.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>LOLAweb: a containerized web server for interactive genomic locus overlap enrichment analysis</article-title>. <source>Nucleic Acids Res.</source> <volume>46</volume>, <fpage>W194</fpage>&#x2013;<lpage>W199</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky464</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oki</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Ohta</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Shioi</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Hatanaka</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Ogasawara</surname> <given-names>O.</given-names>
</name>
<name>
<surname>Okuda</surname> <given-names>Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>ChIP-Atlas: a data-mining suite powered by full integration of public ChIP-seq data</article-title>. <source>EMBO Rep.</source> <volume>19</volume>, <fpage>e46255</fpage>. doi: <pub-id pub-id-type="doi">10.15252/embr.201846255</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Papatheodorou</surname> <given-names>I.</given-names>
</name>
<name>
<surname>Fonseca</surname> <given-names>N. A.</given-names>
</name>
<name>
<surname>Keays</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Tang</surname> <given-names>Y. A.</given-names>
</name>
<name>
<surname>Barrera</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Bazant</surname> <given-names>W.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Expression Atlas: gene and protein expression across multiple studies and organisms</article-title>. <source>Nucleic Acids Res.</source> <volume>46</volume>, <fpage>D246</fpage>&#x2013;<lpage>D251</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkx1158</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roadmap Epigenomics</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Kundaje</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Meuleman</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Ernst</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Bilenky</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Yen</surname> <given-names>A.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Integrative analysis of 111 reference human epigenomes</article-title>. <source>Nature</source> <volume>518</volume>, <fpage>317</fpage>&#x2013;<lpage>330</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nature14248</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sheffield</surname> <given-names>N. C.</given-names>
</name>
<name>
<surname>Bock</surname> <given-names>C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor</article-title>. <source>Bioinformatics</source> <volume>32</volume>, <fpage>587</fpage>&#x2013;<lpage>589</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btv612</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simovski</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Vodak</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Gundersen</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Domanska</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Azab</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Holden</surname> <given-names>L.</given-names>
</name>
<etal/>
</person-group>. (<year>2017</year>). <article-title>GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome</article-title>. <source>Gigascience</source> <volume>6</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi: <pub-id pub-id-type="doi">10.1093/gigascience/gix032</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simovski</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Kanduri</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Gundersen</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Titov</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Domanska</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Bock</surname> <given-names>C.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Coloc-stats: a unified web interface to perform colocalization analysis of genomic features</article-title>. <source>Nucleic Acids Res.</source> <volume>46</volume>, <fpage>W186</fpage>&#x2013;<lpage>W193</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky474</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stunnenberg</surname> <given-names>H. G.</given-names>
</name>
<collab>International Human Epigenome, C.</collab>
<name>
<surname>Hirst</surname> <given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>The international human epigenome consortium: a blueprint for scientific collaboration and discovery</article-title>. <source>Cell</source> <volume>167</volume>, <fpage>1145</fpage>&#x2013;<lpage>1149</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cell.2016.11.007</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trynka</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Westra</surname> <given-names>H. J.</given-names>
</name>
<name>
<surname>Slowikowski</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>X. L.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Stranger</surname> <given-names>B. E.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Disentangling the effects of colocalizing genomic annotations to functionally prioritize non-coding variants within complex-trait loci</article-title>. <source>Am. J. Hum. Genet.</source> <volume>97</volume>, <fpage>139</fpage>&#x2013;<lpage>152</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ajhg.2015.05.016</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Van Der Harst</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Verweij</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease</article-title>. <source>Circ. Res.</source> <volume>122</volume>, <fpage>433</fpage>&#x2013;<lpage>443</lpage>. doi: <pub-id pub-id-type="doi">10.1161/CIRCRESAHA.117.312086</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Van Der Meer</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Barthorpe</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Lightfoot</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Hall</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Gilbert</surname> <given-names>J.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>Cell model passports-a hub for clinical, genetic and functional datasets of preclinical cancer models</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>D923</fpage>&#x2013;<lpage>D929</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky872</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Dai</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Berry</surname> <given-names>L. D.</given-names>
</name>
<name>
<surname>Cogan</surname> <given-names>J. D.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Shyr</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>HACER: an atlas of human active enhancers to interpret regulatory variants</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>D106</fpage>&#x2013;<lpage>D112</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky864</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Enge</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Whitington</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Dave</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Sur</surname> <given-names>I.</given-names>
</name>
<etal/>
</person-group>. (<year>2013</year>). <article-title>Transcription factor binding in human cells occurs in dense clusters formed around cohesin anchor sites</article-title>. <source>Cell</source> <volume>154</volume>, <fpage>801</fpage>&#x2013;<lpage>813</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cell.2013.07.034</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zahn</surname> <given-names>C. T.</given-names>
</name>
</person-group> (<year>1971</year>). <article-title>Graph-theoretical methods for detecting and describing gestalt clusters</article-title>. <source>IEEE Trans. Comput.</source> <volume>20</volume>, <fpage>68</fpage>&#x2013;<lpage>86</lpage>. doi: <pub-id pub-id-type="doi">10.1109/T-C.1971.223083</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zambelli</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Pesole</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Pavesi</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>PscanChIP: finding over-represented transcription factor-binding site motifs and their correlations in sequences from ChIP-Seq experiments</article-title>. <source>Nucleic Acids Res.</source> <volume>41</volume>, <fpage>W535</fpage>&#x2013;<lpage>W543</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkt448</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Wan</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Mei</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Qin</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>H.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>Cistrome data browser: expanded datasets and new tools for gene regulatory analysis</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>D729</fpage>&#x2013;<lpage>D735</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky1094</pub-id>
</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname> <given-names>Y. D.</given-names>
</name>
<name>
<surname>Park</surname> <given-names>S. Y.</given-names>
</name>
<name>
<surname>Su</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Bailey</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Ottosson-Laakso</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Shcherbina</surname> <given-names>L.</given-names>
</name>
<etal/>
</person-group>. (<year>2014</year>). <article-title>TCF7L2 is a master regulator of insulin production and processing</article-title>. <source>Hum. Mol. Genet.</source> <volume>23</volume>, <fpage>6419</fpage>&#x2013;<lpage>6431</lpage>. doi: <pub-id pub-id-type="doi">10.1093/hmg/ddu359</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>