Event Abstract

Brain Imaging Data Structure - a new standard for describing and organizing human neuroimaging data

  • 1 Stanford University, United States
  • 2 University of California, United States
  • 3 University of California, United States
  • 4 SRI International, United States
  • 5 Cambridge University, United Kingdom
  • 6 Child Mind Institute, United States
  • 7 University College London, United Kingdom
  • 8 Massachusetts Institute of Technology, United States
  • 9 University of Washington, United States
  • 10 Dartmouth College, United States
  • 11 Otto-von-Guericke-University, Germany
  • 12 University of Massachusetts, United States
  • 13 Massachusetts General Hospital, United States
  • 14 University of Warwick, United Kingdom
  • 15 Georgia State University, United States
  • 16 McGill University, Canada

Introduction: Typically, the output of a human neuroimaging experiment is a complex, heterogeneous and multidimensional dataset that can be arranged and described in many different ways. So far, there is no consensus on how to organize and share raw data obtained in such experiments. For example, two researchers working in the same lab can choose to organize their data in different ways, causing confusion and potentially leading to loss of (meta)data. Previously proposed solutions to this problem involved flexible, but complicated, file formats (Gadde et al., 2012) or the use of specific databases (e.g., Marcus, Olsen, Ramaratnam, & Buckner, 2007); however, a lack of technical expertise in many neuroimaging labs makes adoption of such solutions challenging. Simpler data organizations, such as the standard used in OpenfMRI.org (Poldrack et al., 2013), lack the flexibility needed to support a wide range of experimental designs and data types. Here we introduce a simple and easy-to-adopt way to organize neuroimaging and behavioral data that facilitates sharing both within and across labs.
Methods: Initial work on the standard began during a special meeting of the Neuroimaging Data Sharing Task Force (NIDASH) held at Stanford University in January 2015. Even though the group is committed to the long-term support of semantic web standards such as the Neuroimaging Data Model (NIDM, Keator et al., 2013), participants acknowledged the need for a simpler and more easily implemented standard for the organization of neuroimaging datasets. As a starting point, the group used the existing OpenfMRI data description standard, as it has already shown its scalability and practicality through wide adoption for many datasets. The process of drafting the specification took three more months and involved weekly telephone calls and consultations with members of the community experienced in handling different types of neuroimaging data.
Results & Discussion: Within neuroimaging laboratories, it is common for metadata (such as subject ID or scan type) to be encoded in the filename and folder structure. When developing the Brain Imaging Data Structure (BIDS; https://goo.gl/BOLyWR) we wanted to preserve this feature to make the structure accessible to researchers with little software expertise. In essence, BIDS describes how data should be organized into files and folders. The folder hierarchy encodes subject ID, session number, and modality; imaging data are stored in the compressed NIfTI format. Key/value metadata (such as repetition time or slice timing) is stored in a JSON file. Because the structure is hierarchical, a single JSON file can describe scanning parameters that are common to all sessions and/or all subjects. Tabular data (e.g., variables describing each subject or events during the scan) are stored in tab-separated value (.tsv) files. The standard covers descriptions of task and resting-state fMRI data, structural data (including, but not limited to, quantitative T1 maps), field maps, and diffusion data. The BIDS standard aims to include everything that is necessary to analyze the data given a hypothesis, and thus includes population variables (e.g., age, sex or questionnaire scores) and detailed timing of stimuli presented and responses recorded in the scanner. Although BIDS requires some types of data to be described in a specific way, it also accommodates extensions. Additional files collected during experiments that are currently not covered by the standard can be added at any level of the hierarchy. We envision that this specification will evolve through feedback from the community, providing consensus for describing data types. To improve adoption, we have developed a validator (https://github.com/chrisfilo/bids-validator) that checks whether a dataset conforms to the BIDS specification. We hope that the validator will make it easier for new users to implement BIDS in their labs.
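The layout described above can be illustrated with a minimal Python sketch. It is not part of the specification: the function names (scan_path, resolve_metadata, read_tsv, demo), the "sub-"/"ses-" folder prefixes, the sidecar file name and the JSON keys are assumptions chosen for illustration. The rule that a JSON file deeper in the hierarchy overrides one higher up, and the use of one identical sidecar file name at every level, are likewise simplifying assumptions; the text above only states that a single JSON file can describe parameters shared across sessions and/or subjects.

```python
import csv
import io
import json
import tempfile
from pathlib import Path


def scan_path(root, subject, session, modality, suffix):
    """Build an illustrative path: root/sub-XX/ses-YY/modality/filename.

    The naming pattern is an assumption made for this sketch.
    """
    name = f"sub-{subject}_ses-{session}_{suffix}.nii.gz"
    return Path(root) / f"sub-{subject}" / f"ses-{session}" / modality / name


def resolve_metadata(root, scan, sidecar_name):
    """Merge JSON files called `sidecar_name` from the root down to the scan folder.

    Assumed rule: values found deeper in the hierarchy override shared ones.
    """
    root = Path(root)
    folders = [root]
    for part in scan.parent.relative_to(root).parts:
        folders.append(folders[-1] / part)
    merged = {}
    for folder in folders:
        candidate = folder / sidecar_name
        if candidate.is_file():
            merged.update(json.loads(candidate.read_text()))
    return merged


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def demo():
    """Create a tiny throwaway dataset and resolve the metadata for one scan."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        scan = scan_path(root, "01", "01", "func", "task-rest_bold")
        scan.parent.mkdir(parents=True)
        # Parameters shared by the whole dataset live at the top level.
        shared = {"RepetitionTime": 2.0, "TaskName": "rest"}
        (root / "task-rest_bold.json").write_text(json.dumps(shared))
        # A session-specific value placed next to the scan overrides the shared one.
        local = {"RepetitionTime": 1.5}
        (scan.parent / "task-rest_bold.json").write_text(json.dumps(local))
        metadata = resolve_metadata(root, scan, "task-rest_bold.json")
        return scan.relative_to(root).as_posix(), metadata


if __name__ == "__main__":
    relative_path, metadata = demo()
    print(relative_path)
    print(metadata)
    print(read_tsv("participant_id\tage\nsub-01\t34\n"))
```

In this sketch a lab would keep one top-level JSON file for parameters that never change and add a second file only where a session deviates, which keeps the amount of hand-written metadata small.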
We also hope that data analysis tools and online repositories will adopt BIDS as an input (tools and repositories) and output (repositories) data layout, enabling users to apply analysis pipelines that are being developed for data formatted according to the standard. For the time being, the standard describes the data and a rich collection of relevant variables. This is necessary, but not sufficient, to replicate an analysis. In the future, we plan to add an extension to BIDS that will allow hypotheses to be specified in the form of a relationship between measured variables and their transformations (e.g., “Does cortical thickness correlate with the square of age?”). This will allow automated analysis of the data. Future research will also involve developing a converter from BIDS to the upcoming semantic-web-based NIDM-Experiment standard, linking data with provenance and ontological information.
Conclusions: We introduced a new standard based on a file-system hierarchy, with files containing key metadata, to describe data collected during human neuroimaging experiments. It aims to be simple and easy to adopt as a standard lab practice that would later facilitate sharing and archiving.

Acknowledgements

We would like to acknowledge the work of all the INCF task force members as well as of many other colleagues who have helped the task force. We are particularly indebted to Mathew Abrams, Linda Lanyon, Roman Valls Guimera and Sean Hill for their support at the INCF. Further, we acknowledge the long-standing support of DDWG activities by the BIRN coordinating center (NIH 1 U24 RR025736-01), and the Wellcome Trust for support of CM & TEN.

References

Gadde, S., Aucoin, N., Grethe, J. S., Keator, D. B., Marcus, D. S., and Pieper, S. (2012). XCEDE: An Extensible Schema for Biomedical Data. Neuroinformatics 10, 19–32. doi:10.1007/s12021-011-9119-9.
Keator, D. B., Helmer, K., Steffener, J., Turner, J. A., Van Erp, T. G. M., Gadde, S., Ashish, N., Burns, G. A., and Nichols, B. N. (2013). Towards structured sharing of raw and derived neuroimaging data across existing resources. Neuroimage 82, 647–661. doi:10.1016/j.neuroimage.2013.05.094.
Marcus, D. S., Olsen, T. R., Ramaratnam, M., and Buckner, R. L. (2007). The extensible neuroimaging archive toolkit. Neuroinformatics 5, 11–33. doi:10.1385/NI:5:1:11.
Poldrack, R. A., Barch, D. M., Mitchell, J. P., Wager, T. D., Wagner, A. D., Devlin, J. T., Cumba, C., Koyejo, O., and Milham, M. P. (2013). Toward open sharing of task-based fMRI data: the OpenfMRI project. Front. Neuroinform. 7, 1–12. doi:10.3389/fninf.2013.00012.

Keywords: Neuroimaging, Standards, data sharing, fMRI, MRI

Conference: Neuroinformatics 2015, Cairns, Australia, 20 Aug - 22 Aug, 2015.

Presentation Type: Poster, to be considered for oral presentation

Topic: Neuroimaging

Citation: Gorgolewski KJ, Poline J, Keator DB, Nichols BN, Auer T, Craddock RC, Flandin G, Ghosh SS, Sochat VV, Rokem A, Halchenko YO, Hanke M, Haselgrove C, Helmer K, Maumet C, Nichols TE, Turner JA, Das S, Kennedy DN and Poldrack RA (2015). Brain Imaging Data Structure - a new standard for describing and organizing human neuroimaging data. Front. Neurosci. Conference Abstract: Neuroinformatics 2015. doi: 10.3389/conf.fnins.2015.91.00056

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 01 May 2015; Published Online: 05 Aug 2015.

* Correspondence: Dr. Krzysztof J Gorgolewski, Stanford University, Stanford, United States, krzysztof.gorgolewski@gmail.com