Mapping neuroimaging resources into the NIDASH Data Model for federated information retrieval
University of Washington, USA
Columbia University, USA
University of Massachusetts Medical School, USA
University of California, Irvine, USA
University of California, San Diego, USA
University of California, Berkeley, USA
Massachusetts Institute of Technology, USA
The astounding influx of human brain imaging data makes data annotation and sharing an essential aspect of modern neuroimaging research. However, no neuroimaging data exchange standard exists that makes consuming and publishing shared neuroimaging data simple and meaningful to researchers. In this work, we use the NIDASH Data Model (NI-DM; [1]), a neuroimaging domain-specific extension to the W3C PROV Data Model [2], to create NI-DM Object Models that represent neuroimaging resources in the general context of provenance information. NI-DM is a key component of an effort to build a larger Semantic Web and Linked Data framework for the generation, storage and query of persistent brain imaging data (and associated metadata) in the context of existing ontologies.
We developed NI-DM Object Models to integrate three common brain imaging data modeling patterns: 1) database schemas, 2) standard directory structures, and 3) CSV/text files (Figure 1). The ADHD200 dataset (973 participants) was downloaded from the NITRC Image Repository [3], an XNAT database [4]. The T1-weighted anatomical scans for each participant were processed using the ‘recon-all’ tool from FreeSurfer (FS) Version 5.1 [5], and additional phenotypic data was downloaded as a CSV file from NITRC. An NI-DM Object Model was then constructed for each information type.
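As an illustration of the CSV pattern only, the following minimal sketch shows how one row of a phenotypic CSV file could be turned into PROV-style triples. It is not the NI-DM Object Model itself: the `http://example.org/adhd200/` namespace, the column names (`participant_id`, `age`, `sex`), the sample row, and the choice to link the record to the participant with `prov:wasAttributedTo` are all assumptions made for the example. Only the PROV and RDF namespace IRIs are taken from the W3C specifications.

```python
import csv
import io

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/adhd200/"  # hypothetical namespace, not an NI-DM IRI


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Serialize a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def row_to_triples(row):
    """Map one CSV row to (subject, predicate, object) triples.

    Assumed modeling: the participant is a prov:Person, the row is a
    prov:Entity attributed to that person, and every other non-empty
    column becomes a literal-valued property of the entity.
    """
    person = EX + "participant/" + row["participant_id"]
    record = EX + "phenotype/" + row["participant_id"]
    triples = [
        (iri(person), iri(RDF_TYPE), iri(PROV + "Person")),
        (iri(record), iri(RDF_TYPE), iri(PROV + "Entity")),
        (iri(record), iri(PROV + "wasAttributedTo"), iri(person)),
    ]
    for column, value in row.items():
        if column != "participant_id" and value != "":
            triples.append((iri(record), iri(EX + column), literal(value)))
    return triples


def to_ntriples(triples):
    """Join triples into N-Triples text, one statement per line."""
    return "\n".join(" ".join(t) + " ." for t in triples)


# Invented sample data, used only to exercise the functions above.
SAMPLE_CSV = "participant_id,age,sex\nsub-0001,10.5,F\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ntriples = to_ntriples(row_to_triples(rows[0]))
print(ntriples)
```

The resulting N-Triples text could be loaded into any RDF store; a real mapping would use the terms defined by the NI-DM Object Models rather than the placeholder properties shown here.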
Three deliverables resulted from this effort. First, we defined NI-DM Object Models that represent information derived from the XNAT database schema, the FS standard subject directory structure, and the contents of FS statistics files (i.e., CSV/text files). These Object Models were expressed in a set of IPython Notebooks to demonstrate the encoding process [6]. Second, the ADHD200 dataset was used to instantiate a Linked Data/RDF [7, 8] representation of the NI-DM Object types, each of which was uploaded into an RDF database (Figure 1). This representation is designed to capture data, associated metadata and provenance to allow for distributed storage and federated query. Third, we developed several queries in SPARQL [9], the query language for Linked Data, to evaluate the information retrieval capabilities of NI-DM. Two types of queries were successfully implemented: single data source queries and multi-data source federated queries [10]. Using these queries, we were able to federate data sources and retrieve 1) participant demographics, 2) file resources and 3) anatomical statistics.
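To make the multi-data source case concrete, here is a hedged sketch of what a federated query could look like, using the `SERVICE` keyword from SPARQL 1.1 Federated Query. The endpoint URLs, the `ex:` namespace, and the `ex:age` and `ex:volume_mm3` properties are hypothetical placeholders, not the queries or vocabulary used in this work. The Python function only assembles the query text; sending it to an endpoint is left out.

```python
def build_federated_query(demographics_endpoint, stats_endpoint):
    """Return SPARQL 1.1 text that joins two endpoints on ?participant.

    Each SERVICE block is evaluated by the named remote endpoint; the
    shared ?participant variable joins the two result sets.
    All ex: terms are hypothetical placeholders.
    """
    return f"""PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX ex: <http://example.org/adhd200/>

SELECT ?participant ?age ?volume
WHERE {{
  SERVICE <{demographics_endpoint}> {{
    ?record prov:wasAttributedTo ?participant ;
            ex:age ?age .
  }}
  SERVICE <{stats_endpoint}> {{
    ?stats prov:wasAttributedTo ?participant ;
           ex:volume_mm3 ?volume .
  }}
}}
"""


# Hypothetical endpoint URLs, used only for illustration.
query = build_federated_query(
    "http://example.org/sparql/demographics",
    "http://example.org/sparql/freesurfer",
)
print(query)
```

A single data source query would have the same shape with the `SERVICE` wrappers removed, so that both graph patterns are matched against one store.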
The work presented here is being performed in the context of many related efforts to define a terminology for brain imaging and to create ontologies that capture relationships in these vocabularies. By leveraging RDF, we broaden the range of biomedical information resources included in the Linked Data enterprise, and we gain access to existing services and libraries that can simplify query generation and speed up response times. We expect this distributed model to demonstrate its usefulness even before it is fully adopted by the community. We have focused here on demonstrating the utility of NI-DM in the context of brain imaging, particularly in the representation of data processed by FS, but the benefits of the data model will grow as more brain imaging object models are designed for additional analysis packages (e.g., FSL, SPM) and derived datatypes.
This work was conducted with the Neuroimaging Task Force of the INCF Program on Standards for Data Sharing and the BIRN derived-data working group.
[1] NIDASH: http://nidm.nidash.org
[2] PROV-DM: http://www.w3.org/TR/prov-dm
[3] NITRC-IR: http://www.nitrc.org/ir
[4] Marcus DS, Olsen TR, Ramaratnam M, Buckner RL. 2007. The extensible neuroimaging archive toolkit. Neuroinformatics 5(1):11-33. DOI: 10.1385/NI:5:1:11
[5] Fischl B. 2012. FreeSurfer. NeuroImage 62(2):774-781. DOI: 10.1016/j.neuroimage.2012.01.021
[6] NIDASH notebooks: https://github.com/ni-/notebooks
[7] Linked Data: http://www.w3.org/standards/semanticweb/data
[8] Resource Description Framework (RDF): http://www.w3.org/RDF
[9] SPARQL: http://www.w3.org/TR/rdf-sparql-query
[10] SPARQL 1.1 Federated Query: http://www.w3.org/TR/sparql11-federated-query
Neuroinformatics 2013, Stockholm, Sweden, 27 Aug - 29 Aug, 2013.
(2013). Mapping neuroimaging resources into the NIDASH Data Model for federated information retrieval.
29 Apr 2013;
11 Jul 2013.
Mr. B. Nolan Nichols, University of Washington, Seattle, USA, firstname.lastname@example.org