Technology Report ARTICLE Provisionally accepted The full-text will be published soon.

Front. Neuroinform. | doi: 10.3389/fninf.2018.00091

Integration of “omics” data and phenotypic data within a unified extensible multimodal framework

Samir Das1, 2, Xavier Lecours Boucher1, 2*, Christine Rogers1, 2, François Chouinard-Decorte1, 2, Kathleen Oros Klein3, 4, Carolina Makowski1, 2, 5, Natacha Beck1, 2, Pierre Rioux1, 2, Shawn T. Brown1, 2, Zia Mohaddes1, 2, Cole Zweber1, 2, Victoria Foing1, 2, Marie Forest3, 6, Kieran O’Donnell3, 5, Joanne Clark3, Celia M. Greenwood3, 6, Michael J. Meaney3, 5 and Alan C. Evans1, 2
  • 1McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Canada
  • 2Montreal Neurological Institute, McGill University, Canada
  • 3Ludmer Centre for Neuroinformatics and Mental Health, McGill University, Canada
  • 4Lady Davis Institute (LDI), Canada
  • 5Douglas Hospital Research Centre, Canada
  • 6Jewish General Hospital, Canada

Analysis of “omics” data is often a long and segmented process, encompassing multiple stages from initial data collection to processing, quality control, and visualization. The cross-modal nature of recent genomic analyses renders this process challenging to both automate and standardize; consequently, users often resort to manual interventions that compromise data reliability and reproducibility. This in turn can produce multiple versions of datasets across storage systems. As a result, scientists can lose significant time and resources trying to execute and monitor their analytical workflows and encounter difficulties sharing versioned data. In 2015, the Ludmer Centre for Neuroinformatics and Mental Health at McGill University brought together expertise from the Douglas Mental Health University Institute, the Lady Davis Institute, and the Montreal Neurological Institute (MNI) to form a genetics/epigenetics working group. The objectives of this working group are to i) design an automated and seamless process for (epi)genetic data that consolidates heterogeneous datasets into the LORIS open-source data platform, ii) streamline data analysis, iii) integrate results with provenance information, and iv) facilitate structured and versioned sharing of pipelines for optimized reproducibility using high-performance computing (HPC) environments via the CBRAIN processing portal. 
This paper outlines the resulting generalizable “omics” framework and its benefits, specifically the ability to i) integrate multiple types of biological and multimodal datasets (imaging, clinical, demographic, and behavioural), ii) automate the launching of analysis pipelines on HPC platforms, iii) remove the bioinformatic barriers inherent to this process, iv) ensure standardization and transparent sharing of processing pipelines to improve computational consistency, v) store results in a queryable web interface, vi) offer visualization tools for exploring the data, and vii) provide mechanisms to ensure usability and reproducibility. By automating analysis pipelines and seamlessly linking multimodal data, the framework reduces human error and facilitates brain research discovery, allowing investigators to focus on research instead of data handling.

Keywords: workflow, Omics analysis, Longitudinal Studies, integrative neuroscience, Biostatistics, reproducibility, Multimodal, database, HPC, automated, Genomics, Genetics

Received: 20 Aug 2018; Accepted: 16 Nov 2018.

Edited by:

Sook-Lei Liew, University of Southern California, United States

Reviewed by:

Rupert W. Overall, Deutsches Zentrum für Neurodegenerative Erkrankungen, Helmholtz-Gemeinschaft Deutscher Forschungszentren (HZ), Germany
Vincent Frouin, Neurospin, France  

Copyright: © 2018 Das, Lecours Boucher, Rogers, Chouinard-Decorte, Oros Klein, Makowski, Beck, Rioux, Brown, Mohaddes, Zweber, Foing, Forest, O’Donnell, Clark, Greenwood, Meaney and Evans. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mr. Xavier Lecours Boucher, McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Montreal, Quebec, Canada,