Technology and Code ARTICLE. Provisionally accepted; the full text will be published soon.

Front. Genet. | doi: 10.3389/fgene.2019.00963

A generic multivariate framework for the integration of microbiome longitudinal studies with other data types

  • 1 Université Laval, Canada
  • 2 IRSTEA, France
  • 3 The University of Melbourne, Australia

Simultaneous profiling of biospecimens using different technological platforms enables the study of many data types, encompassing microbial communities, omics and meta-omics as well as clinical or chemistry variables. Falling costs now make longitudinal or time-course studies on the same biological material or system feasible. The overall aim of such studies is to investigate relationships between these longitudinal measures in a holistic manner, to further decipher the link between molecular mechanisms and microbial community structures, or host-microbiota interactions. However, analytical frameworks enabling an integrated analysis of microbial communities with other types of biological, clinical or phenotypic data are still in their infancy. The challenges include a small number of time points, which may be unevenly spaced and unmatched across data types, few unique individual biospecimens, and high variability between individuals. These challenges are further exacerbated by the inherent characteristics of microbial community-derived data (e.g. sparsity and compositionality).

We propose a generic data-driven framework to integrate different types of longitudinal data measured on the same biological specimens with microbial community data, and to select key temporal features with strong associations within the same sample group. The framework ranges from filtering and modelling to integration, using smoothing splines and multivariate dimension reduction methods to address some of the analytical challenges of microbiome-derived data. We illustrate our framework on different types of multi-omics case studies in bioreactor experiments as well as human studies.
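To make the sequence of steps concrete, the following is a minimal Python sketch of a filter, smooth, then reduce pipeline on toy data. It is not the authors' implementation: the function names, the abundance threshold, the smoothing value and the toy counts are all assumptions made for illustration, a plain cubic smoothing spline from SciPy stands in for whatever spline model the framework uses, and normalisation for compositional data is omitted.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def filter_features(counts, min_fraction=0.01):
    """Return column indices of features holding at least min_fraction of all counts."""
    fraction = counts.sum(axis=0) / counts.sum()
    return np.flatnonzero(fraction >= min_fraction).tolist()


def smooth_profiles(times, values, grid, smoothing):
    """Fit one cubic smoothing spline per feature and evaluate it on a shared time grid."""
    fitted = [
        UnivariateSpline(times, values[:, j], k=3, s=smoothing)(grid)
        for j in range(values.shape[1])
    ]
    return np.column_stack(fitted)


def cluster_by_first_component(profiles):
    """Label each feature by the sign of its loading on the first principal component."""
    scaled = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return np.sign(vt[0]).astype(int).tolist()


# Hypothetical toy data: 6 unevenly spaced time points, 4 features (columns).
times = np.array([0.0, 1.0, 2.0, 4.0, 7.0, 10.0])
counts = np.column_stack([
    [10, 15, 20, 30, 45, 60],   # increasing
    [20, 24, 25, 33, 40, 51],   # increasing, slightly noisy
    [80, 74, 68, 56, 38, 20],   # decreasing
    [0, 1, 0, 0, 1, 0],         # rare feature, expected to be filtered out
])

kept = filter_features(counts, min_fraction=0.01)
grid = np.linspace(times.min(), times.max(), 11)   # evenly spaced common grid
profiles = smooth_profiles(times, counts[:, kept].astype(float), grid, smoothing=10.0)
labels = cluster_by_first_component(profiles)
print(kept, labels)
```

Evaluating every fitted curve on a shared grid is one way to compare data types sampled at different, unevenly spaced time points. The sign of a principal component is arbitrary, so only the grouping of features matters here, not which group is labelled +1 or -1.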

Keywords: time course, data integration, splines, feature selection, clustering

Received: 04 Apr 2019; Accepted: 10 Sep 2019.

Copyright: © 2019 Bodein, Chapleur, Droit and Le Cao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Kim-Anh Le Cao, The University of Melbourne, Melbourne, Australia, kimanh.lecao@unimelb.edu.au