EDITORIAL article

Front. Cell Dev. Biol., 07 July 2015

Sec. Systems Biology Archive

Volume 3 - 2015 | https://doi.org/10.3389/fcell.2015.00046

Editorial: Multi-omic data integration

  • 1. Lazzari Bologna, Italy

  • 2. Group of Clinical Genomic Networks, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences Shanghai, China

  • 3. Quintiles Reading, UK

  • 4. Consiglio Nazionale delle Ricerche, Istituto per le Applicazioni del Calcolo Rome, Italy

As researchers involved in molecular biology, we are witnessing tremendous paradigm changes within ever shorter time frames. The epoch-making notion, originally put forward by the central dogma of biology (Crick, 1970), that there is a unidirectional process and a privileged (genetic) level of causality at which biological functions are determined, has long been strongly challenged. It is in fact well recognised that multi-level causality, with feedback cycles among all established and newly identified biochemical levels (including small RNAs and epigenomic changes), is a fundamental attribute of biological systems (Noble, 2012).

Yet the shift in focus from single reactions to transcriptomics, promoted first by microarrays and now by sequencers, is itself already challenged by the pressing opportunities offered by fast-evolving technologies. Indeed, the possibility of obtaining an omic view of virtually all molecular layers (genomes, metagenomes, transcriptomes, proteomes, epigenomes) pushes us to integrate the study of systems at yet another level of complexity. This effort is hampered, and not negligibly, by the difficulties in formatting, storing, and reusing the deluge of data encompassing every level of biological organization.

Against such a complex background, it is increasingly acknowledged that tools and theoretical frameworks that could help in combining and accounting for both the multi-level causation scheme and the burden of data are still underdeveloped (Witzany and Baluska, 2012).

From these considerations arises a novel, pressing demand for methodologies, approaches, and frameworks that allow these data to be interpreted as a whole, i.e., as intertwined molecular signatures containing genes, proteins, mRNAs, and miRNAs, but also epigenomic characterizations and correlations with microbiome compositions, to name only the major ones, and that are able to capture the inter-layer connections and the complexity of phenotypes. This demand is accompanied by concerns about the storage and reusability of much of these diverse omic data. Indeed, although publicly and freely available, these data often lie in databases and repositories, underutilized or not used at all. Issues arising from the lack of standardization and of shared biological identities are also well known to represent a hurdle for data reuse (Tieri and Nardini, 2013; Chowdhury and Sarkar, 2015).

The “Multi-Omic Data Integration” Research Topic is intended as a dedicated forum to collect efforts that help define this emerging field, aimed at the integration of data, analyses, and approaches from and for multiple omics.

The articles collected here address these questions from a number of perspectives, which we summarize as experimental, network based, and methodological. In the first category, the authors extract and analyse different types of high-throughput data (epitomics, localisomics, transcriptomics, lipidomics) from different locations in model organisms [Arabidopsis thaliana (Wilson et al., 2015) and rhesus macaques (Lee et al., 2014)] to address complex biological questions (root growth and response to anti-malarial drugs) that could not be answered with single-omic approaches.

We transition from these approaches to more theoretical ones via the use of graphs. Networks offer a complete, intuitive, versatile, and powerful approach to the representation of complex systems (genomics, epigenomics, transcriptomics, metabolomics, host-microbiome interface, disease phenomics). This approach is exploited here to represent the multifaceted aspects of complex autoimmune diseases (rheumatoid arthritis; Tieri et al., 2014) in order to evaluate complex side effects of old and novel therapies; to identify disease molecules that can be both effective therapeutic targets and relevant progression markers, with application to diabetic nephropathy (Heinzel et al., 2014); and to stratify patients with comorbidities (Moni and Lio, 2015).

Methodological approaches point with novel emphasis to the importance of the spatial localization of molecules in the omic context. From polysome and ribosome profiling and RNA and miRNA binding site annotation and standardization (Dassi and Quattrone, 2014), to networks that include the 3D proximity of molecules thanks to Chromosome Conformation Capture (3C) and its omic version Hi-C (Merelli et al., 2015), spatial representation contributes an important layer of information to this added multi-omic complexity.

Beyond spatial organization, temporal progression and causal inference are discussed to model the heterogeneity of CD4+ T cells and their complex immune responses (Carbo et al., 2014), and to predict gene networks based on ChIP-seq and RNA-seq integration (Angelini and Costa, 2014).

Finally, meta-analyses of genomes, be it for the exploration of microbiome compositions or for disease genome-wide association studies (GWAS), still benefit from discussion in this Research Topic: on one side, because of the need to standardize the workflow (Ladoukakis et al., 2014) in a relatively novel research area (omic microbiology); on the other, to compensate with multi-omic layers for the limited statistical power and reproducibility of GWAS (Lin et al., 2014).

This collection is the tip of an iceberg that continues to grow and to evolve in multiple directions. From the continuously improving efficiency of existing high-throughput platforms, which implies easier, cheaper, and more frequent spatio-temporal sampling, to the input of novel technologies that will offer omic views of novel types of data (phenotypes, tissues, 3D proteins, etc., all entailing the production and approval of dedicated standards for data storage), we are only at the beginning of almost endless possibilities for data integration.

However, to avoid getting lost in the sea of data, efficient algorithms, as well as biologically meaningful directions in which to integrate information, will be essential. This will imply not only the implementation of powerful tools to give answers, but also the design of careful approaches to form questions.

We hope and foresee that these needs will further foster collaboration between biologists, medical doctors, statisticians, and computer scientists, transforming the residual perception of this forced cooperation from a burden into a resource. The impact of completing this other type of integration, among scientific expertise, is difficult to predict at large, but it can safely be assumed to be a necessary and crucial starting point for the effective implementation of personalized medicine, where patients' and health practitioners' needs are translated into technology and report on systemic markers, offering patients the possibility to be treated as a whole and not as a mere assemblage of parts to be “adjusted.”

Funding

This project is partially funded by MoST international cooperation program no. 2013DFA30790, NSFC no. 31070748, EC FP7-PEOPLE-2011-IRSES program, project 294935 “KEPAMOD,” and CAS fellow grant no. 2011Y1SA04.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1. Angelini, C., and Costa, V. (2014). Understanding gene regulatory mechanisms by integrating ChIP-seq and RNA-seq data: statistical solutions to biological problems. Front. Cell Dev. Biol. 2:51. 10.3389/fcell.2014.00051

  • 2. Carbo, A., Hontecillas, R., Andrew, T., Eden, K., Mei, Y., Hoops, S., et al. (2014). Computational modeling of heterogeneity and function of CD4+ T cells. Front. Cell Dev. Biol. 2:31. 10.3389/fcell.2014.00031

  • 3. Chowdhury, S., and Sarkar, R. R. (2015). Comparison of human cell signaling pathway databases–evolution, drawbacks and challenges. Database (Oxford) 2015. 10.1145/2752746

  • 4. Crick, F. (1970). Central dogma of molecular biology. Nature 227, 561–563. 10.1038/227561a0

  • 5. Dassi, E., and Quattrone, A. (2014). Fingerprints of a message: integrating positional information on the transcriptome. Front. Cell Dev. Biol. 2:39. 10.3389/fcell.2014.00039

  • 6. Heinzel, A., Perco, P., Mayer, G., Oberbauer, R., Lukas, A., and Mayer, B. (2014). From molecular signatures to predictive biomarkers: modeling disease pathophysiology and drug mechanism of action. Front. Cell Dev. Biol. 2:37. 10.3389/fcell.2014.00037

  • 7. Ladoukakis, E., Kolisis, F. N., and Chatziioannou, A. A. (2014). Integrative workflows for metagenomic analysis. Front. Cell Dev. Biol. 2:70. 10.3389/fcell.2014.00070

  • 8. Lee, K. J., Yin, W., Arafat, D., Tang, Y., Uppal, K., Tran, V., et al. (2014). Comparative transcriptomics and metabolomics in a rhesus macaque drug administration study. Front. Cell Dev. Biol. 2:54. 10.3389/fcell.2014.00054

  • 9. Lin, D., Zhang, J., Li, J., He, H., Deng, H. W., and Wang, Y. P. (2014). Integrative analysis of multiple diverse omics datasets by sparse group multitask regression. Front. Cell Dev. Biol. 2:62. 10.3389/fcell.2014.00062

  • 10. Merelli, I., Tordini, F., Drocco, M., Aldinucci, M., Lio, P., and Milanesi, L. (2015). Integrating multi-omic features exploiting chromosome conformation capture data. Front. Genet. 6:40. 10.3389/fgene.2015.00040

  • 11. Moni, M. A., and Lio, P. (2015). How to build personalized multi-omics comorbidity profiles. Front. Cell Dev. Biol. 3:28. 10.3389/fcell.2015.00028

  • 12. Noble, D. (2012). A theory of biological relativity: no privileged level of causation. Interface Focus 2, 55–64. 10.1098/rsfs.2011.0067

  • 13. Tieri, P., and Nardini, C. (2013). Signalling pathway database usability: lessons learned. Mol. Biosyst. 9, 2401–2407. 10.1039/c3mb70242a

  • 14. Tieri, P., Zhou, X., Zhu, L., and Nardini, C. (2014). Multi-omic landscape of rheumatoid arthritis: re-evaluation of drug adverse effects. Front. Cell Dev. Biol. 2:59. 10.3389/fcell.2014.00059

  • 15. Wilson, M. H., Holman, T. J., Sorensen, I., Cancho-Sanchez, E., Wells, D. M., Swarup, R., et al. (2015). Multi-omics analysis identifies genes mediating the extension of cell walls in the Arabidopsis thaliana root elongation zone. Front. Cell Dev. Biol. 3:10. 10.3389/fcell.2015.00010

  • 16. Witzany, G., and Baluska, F. (2012). Life's code script does not code itself. The machine metaphor for living organisms is outdated. EMBO Rep. 13, 1054–1056. 10.1038/embor.2012.166

Summary

Keywords

multi-omics, multi-omic data integration, integration, systems biology, network analysis

Citation

Nardini C, Dent J and Tieri P (2015) Editorial: Multi-omic data integration. Front. Cell Dev. Biol. 3:46. doi: 10.3389/fcell.2015.00046

Received

27 May 2015

Accepted

25 June 2015

Published

07 July 2015

Edited by

Raina Robeva, Sweet Briar College, USA

Reviewed by

Matteo Barberis, University of Amsterdam, Netherlands

Copyright

*Correspondence: Paolo Tieri,

This article was submitted to Systems Biology, a section of the journal Frontiers in Cell and Developmental Biology
