From Integration to Visualization of Multisite Brain Data on Brain-CODE
Francis Jeanson1*, Christina Popovich1, Brendan Behan1, Rachad El-Badrawi2, Stephen C. Strother3, 4, 5, Tom Gee3, Fan Dong3, Stephen Arnott3, Moyez Dharsee2, Mojib Javadi2, Costa Dafnas6 and Chris McPhee6
- 1 Ontario Brain Institute, Canada
- 2 Indoc Research, Canada
- 3 Rotman Research Institute, Canada
- 4 Baycrest & Medical Biophysics, Canada
- 5 University of Toronto, Canada
- 6 Centre for Advanced Computing, Canada
Brain-CODE is the Ontario Brain Institute’s (OBI) extensible, large-scale neuroinformatics platform that manages the acquisition, storage, sharing, and analytics of multi-dimensional data [1]. Brain-CODE currently serves as the central platform for collecting data from OBI’s Integrated Discovery Programs (IDPs), which span five major brain disorders: neurodevelopmental disorders (POND), cerebral palsy (CP-NET), epilepsy (EpLink), depression (CAN-BIND), and neurodegeneration (ONDRI) [2]. With over 400 researchers and research staff using the platform, Brain-CODE hosts clinical research data, neuroimaging data, molecular data, and many other data modalities from over 5000 participants. Figure 1 illustrates OBI's integrated research approach.
Data Integration
Integrating data across modalities and disorders can create unprecedented opportunities to uncover the mechanisms of brain disorders.
Brain-CODE adopts a multi-stage approach to achieve the integration of data and maximize its use across brain disorders. These stages are:
1. Modality-specific data capture software
2. Standardized procedures and assessments: clinical common data elements, standardized imaging requirements, etc.
3. Baseline quality assurance and quality control
4. Combining multi-modal datatypes
Modality-specific data capture tools are the first stage of integration, as they provide a common way to organize electronic study forms and projects across multiple sites. Brain-CODE adopts web-based applications to keep these tools centralized and accessible. For clinical assessments and other form-based data, REDCap [4] and OpenClinica [5] have been adopted; for imaging, XNAT [6] was adapted to a version called SPReD [7]; and for molecular data collection, LabKey [8] was adopted.
Standardization of variables and experimental processes is the next stage in achieving effective integration of research data. Building on existing standards, and working with the research community through a Delphi consensus process, OBI has established a core set of Common Data Elements (CDEs) to be used by all human studies [9]. More recently, OBI has also defined a base set of experimental parameters to be adopted across all EEG studies, sites, and brain disorders. Genomic standardization is the next phase: participation in international efforts such as the Global Alliance for Genomics and Health (GA4GH) [11] will help identify standards for greater interoperability and analytic potential.
Baseline quality assurance and quality control techniques are essential for ensuring that data are collected with well-calibrated instruments and for verifying that output files meet minimum standards. Checks include subject, variable, file, and project naming; acquisition sequence naming and protocols; output from quality control pipelines; clinical field score validation; case report form logic consistency; and flagging of outlier and missing values.
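To make these checks concrete, the following is a minimal Python sketch of a baseline validation pass, not the Brain-CODE implementation. The subject ID pattern, the field names, the valid score range, and the z-score threshold are all hypothetical and chosen only for illustration.

```python
import re
from statistics import mean, stdev

# Hypothetical naming convention (an assumption, not Brain-CODE's actual rule):
# PROGRAM_SITE_NNNN, e.g. "OND01_TWH_0001".
SUBJECT_ID_PATTERN = re.compile(r"^[A-Z]{3}\d{2}_[A-Z]{3}_\d{4}$")


def check_records(records, score_field, valid_range, z_threshold=3.0):
    """Return a list of (subject_id, issue) tuples for a batch of records."""
    issues = []
    low, high = valid_range

    # Mean and standard deviation of the non-missing scores, used for outlier flags.
    values = [r[score_field] for r in records if r.get(score_field) is not None]
    mu = mean(values) if values else 0.0
    sigma = stdev(values) if len(values) > 1 else 0.0

    for r in records:
        sid = r.get("subject_id", "")
        if not SUBJECT_ID_PATTERN.match(sid):
            issues.append((sid, "bad subject id"))

        score = r.get(score_field)
        if score is None:
            issues.append((sid, "missing value"))
            continue
        if not low <= score <= high:
            # Score falls outside the range allowed by the instrument.
            issues.append((sid, "out of range"))
        elif sigma > 0 and abs(score - mu) / sigma > z_threshold:
            # Score is valid but unusually far from the batch mean.
            issues.append((sid, "outlier"))
    return issues
```

A call such as `check_records(batch, "score", (0, 30))` returns one tuple per detected issue, which could then be reported back to the contributing site.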
The final stage of data integration on Brain-CODE involves combining datasets from various modalities. Current transactional tools used for clinical data capture are usually based on relational databases, but also include data models represented as XML schemas and flat files. The integration and federation subsystem within the Brain-CODE platform combines these datasets to produce unified views and to stage data for multivariate analyses. The integration/federation pipeline starts by creating a transitional read-only data store from an enterprise-level data gathering process, followed by selective restructuring of some data components. The IBM InfoSphere Federation Server [12] was adopted to provide a thin, but crucial, virtual data definition layer that allows seamless communication with data sources irrespective of their vendor and format. An API based on backend technologies relies on this federation capability to provide subject-oriented, de-normalized, mart-like data tables within a DB2 database environment, thus laying the groundwork for the various analytics and reporting tools. Figure 2 illustrates data integration on Brain-CODE.
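The idea of a subject-oriented, de-normalized table can be sketched as follows. This example uses Python's built-in sqlite3 module purely as a stand-in; it does not use IBM InfoSphere Federation Server or DB2, and the table names, column names, and sample rows are hypothetical.

```python
import sqlite3


def build_demo_sources(conn):
    """Create small per-modality tables standing in for separate data sources."""
    conn.executescript("""
        CREATE TABLE subjects  (subject_id TEXT PRIMARY KEY, site TEXT);
        CREATE TABLE clinical  (subject_id TEXT, clinical_score INTEGER);
        CREATE TABLE imaging   (subject_id TEXT, scan_type TEXT);
        CREATE TABLE molecular (subject_id TEXT, sample_type TEXT);

        INSERT INTO subjects  VALUES ('S001', 'SiteA'), ('S002', 'SiteB');
        INSERT INTO clinical  VALUES ('S001', 14);
        INSERT INTO imaging   VALUES ('S001', 'T1'), ('S001', 'fMRI');
        INSERT INTO molecular VALUES ('S001', 'blood'), ('S002', 'saliva');
    """)


def build_subject_mart(conn):
    """Define a read-only view with one row per subject across all modalities."""
    conn.executescript("""
        CREATE VIEW subject_mart AS
        SELECT s.subject_id,
               s.site,
               c.clinical_score,
               (SELECT COUNT(*) FROM imaging i
                 WHERE i.subject_id = s.subject_id) AS n_scans,
               (SELECT COUNT(*) FROM molecular m
                 WHERE m.subject_id = s.subject_id) AS n_samples
        FROM subjects s
        LEFT JOIN clinical c ON c.subject_id = s.subject_id;
    """)


def fetch_mart_rows():
    """Build the demo sources and return the mart rows ordered by subject."""
    conn = sqlite3.connect(":memory:")
    build_demo_sources(conn)
    build_subject_mart(conn)
    rows = conn.execute(
        "SELECT * FROM subject_mart ORDER BY subject_id"
    ).fetchall()
    conn.close()
    return rows
```

The LEFT JOIN and the counting subqueries keep subjects that lack data in one modality, so gaps show up as NULL values or zero counts rather than as missing rows.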
Overall, integration plays an essential role in maximizing the value of data, but the integrated data must also be easy to access, probe, and analyse. The following section describes the value of adopting visualization tools to initiate the insight process from the integrated data.
Visual Exploration
One of the major challenges for Brain-CODE researchers and research administrators is monitoring studies, as research is distributed across a multitude of sites with independent schedules and priorities. To help with this, OBI worked with researchers to develop visual data dashboards based on Spotfire by Tibco [13]. Spotfire connects to the IBM InfoSphere Federation Server, which retrieves the integrated query results from the federated databases on Brain-CODE. Furthermore, Spotfire enables the creation of web-based visualization widgets that dynamically load data upon update and support user interactions such as item selection, filtering, and querying.
A general view dashboard capturing the overall data collection status of all OBI-funded programs was created to enable general tracking of study progress. Key metrics on a single view signal the progress of participant recruitment and reveal challenges with study administration or data upload. In addition, dashboards were created to help research administrators review the overall progress of data collection for individual programs or studies. Typical measures include the number of participants recruited per research site, basic demographics, and the number of data records per modality. These dashboards permit ongoing quality control and communication between sites and play an increasingly important part in the research process. Figure 3 illustrates an example visual dashboard for study data overview.
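The kind of summary measures that feed such a dashboard can be sketched in a few lines of Python. This is an illustration only and does not involve Spotfire; the record layout (`site`, `subject_id`, `modality`) is an assumption.

```python
from collections import Counter, defaultdict


def summarize(records):
    """Compute participants per site and data records per modality."""
    participants_by_site = defaultdict(set)
    records_by_modality = Counter()

    for rec in records:
        # A participant is counted once per site, however many records they have.
        participants_by_site[rec["site"]].add(rec["subject_id"])
        records_by_modality[rec["modality"]] += 1

    return {
        "participants_per_site": {
            site: len(ids) for site, ids in sorted(participants_by_site.items())
        },
        "records_per_modality": dict(sorted(records_by_modality.items())),
    }
```

Counting distinct subject IDs per site, rather than rows, keeps recruitment numbers from being inflated by participants who contribute several records.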
Conclusion
Overall, building a large-scale platform that collects a broad spectrum of data types from multiple domains or disorders requires a combination of robust operational management, effective data technologies, and researcher engagement. Already, OBI IDPs such as the Canadian Biomarker Integration Network for Depression (CAN-BIND) are gaining value from the data integration and visualization available on Brain-CODE for early data monitoring and research planning [14]. Finally, by adopting international standards, Brain-CODE will participate in a rich ecosystem with other data partners for greater insights into brain and health.
Acknowledgements
We would like to acknowledge the Government of Ontario, the Integrated Discovery Program researchers for their ongoing feedback, as well as the Indoc Consortium for their work and support on the Brain-CODE platform (http://www.indocresearch.org/indoc-consortium.html).
References
[1] http://dx.doi.org/10.3389/conf.fninf.2014.18.00018, accessed 26/04/2016
[2] http://www.braininstitute.ca/research, accessed 26/04/2016
[3] http://www.privacybydesign.ca, accessed 26/04/2016
[4] http://project-redcap.org, accessed 26/04/2016
[5] https://www.openclinica.com/, accessed 26/04/2016
[6] http://www.xnat.org/, accessed 26/04/2016
[7] https://sites.google.com/a/research.baycrest.org/informatics/spred, accessed 26/04/2016
[8] https://www.labkey.com/, accessed 26/04/2016
[9] https://braincode.ca/content/standards, accessed 26/04/2016
[10] http://ondri.ca, accessed 26/04/2016
[11] https://genomicsandhealth.org/, accessed 26/04/2016
[12] http://www-03.ibm.com/software/products/en/ibminfofedeserv, accessed 26/04/2016
[13] http://spotfire.tibco.com/, accessed 26/04/2016
[14] http://dx.doi.org/10.1186/s12888-016-0785-x, accessed 26/04/2016
Keywords: multi-site study, data integration, visualization, software, dashboards
Conference: Neuroinformatics 2016, Reading, United Kingdom, 3 Sep - 4 Sep, 2016.
Presentation Type: Poster
Topic: Visualization
Citation: Jeanson F, Popovich C, Behan B, El-Badrawi R, Strother SC, Gee T, Dong F, Arnott S, Dharsee M, Javadi M, Dafnas C and McPhee C (2016). From Integration to Visualization of Multisite Brain Data on Brain-CODE. Front. Neuroinform. Conference Abstract: Neuroinformatics 2016. doi: 10.3389/conf.fninf.2016.20.00052
Copyright:
The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers.
They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.
The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.
Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.
For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.
Received: 30 Apr 2016; Published Online: 18 Jul 2016.
* Correspondence: PhD. Francis Jeanson, Ontario Brain Institute, Toronto, Canada, fjeanson@yahoo.com