Event Abstract

An architecture for GRID-based analysis of Neuroimaging data using relational databases and the SWIFT workflow engine.

  • 1 The University of Chicago, United States

Neuroimaging research imposes substantial demands on computational infrastructure. Such infrastructure must manage massive amounts of data while affording rapid analysis, easy access to highly specific subsets of data, and secure remote access for collaborators. We have recently described an architecture that achieves these goals using distributed database management systems (DBMSs) [1]. A central feature of this architecture is that neuroimaging data and metadata are stored in a relational database. This allows highly specific subsets of data to be extracted via SQL queries and readily analyzed on GRID-accessible computing nodes. Here we present two recent upgrades to this system: (a) an extension of the architecture that uses the SWIFT workflow specification system [2] to facilitate analysis of DBMS-stored neuroimaging data on GRID sites (e.g., TeraGrid), and (b) an interface between this system and several commonly used neuroimaging tools (AFNI, SUMA) that allows for immediate graphic depiction of the results of database queries.
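As a rough illustration of extracting a highly specific subset of data with an SQL query, the following minimal sketch uses Python's built-in sqlite3 module as a stand-in for the DBMS. The table name, column names, region labels, and signal values are all hypothetical and are not taken from the system described in [1].

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical time-series table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE timeseries ("
        " subject TEXT, region TEXT, voxel INTEGER, t INTEGER, signal REAL)"
    )
    rows = []
    for subject in ("s01", "s02"):
        for region, voxels in (("STG", (1, 2)), ("IFG", (3,))):
            for voxel in voxels:
                for t in range(4):
                    # Made-up signal values, for illustration only.
                    rows.append((subject, region, voxel, t, 100.0 + voxel + 0.5 * t))
    conn.executemany("INSERT INTO timeseries VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def fetch_region_series(conn, subject, region):
    """Return {voxel: [signal ordered by time]} for one subject and region."""
    cursor = conn.execute(
        "SELECT voxel, signal FROM timeseries"
        " WHERE subject = ? AND region = ?"
        " ORDER BY voxel, t",
        (subject, region),
    )
    series = {}
    for voxel, signal in cursor:
        series.setdefault(voxel, []).append(signal)
    return series


if __name__ == "__main__":
    conn = build_demo_db()
    # Only the rows for subject s01 in region STG are retrieved.
    print(fetch_region_series(conn, "s01", "STG"))
```

Only the rows matching the WHERE clause leave the database, which is the property that makes remote, subset-level access practical.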

Because the system stores neuroimaging data in DBMSs, these data can be accessed via standard Internet protocols, enabling distributed analysis of remotely stored data. GRID-based computing can leverage this capacity for parallel computing, as GRID sites host tens of thousands of computing nodes. Computing at this scale requires special mechanisms to identify and submit jobs to GRID sites, establish provenance for each analysis, and return the results to the user. We will demonstrate how SWIFT achieves these goals in the context of distributed analysis of neuroimaging time series. In particular, a researcher can use SWIFT to issue multiple database queries from multiple GRID sites, extract time series of interest, analyze those data, and retrieve the results.
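The pattern described above (submit jobs, record provenance for each analysis, return results) can be sketched in plain Python. This is not SWIFT syntax or the SWIFT API: a local thread pool stands in for GRID nodes, the analysis step is a toy mean, and the job fields and provenance layout are assumptions made for illustration.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor


def mean_signal(series):
    """Toy analysis step: the mean of one time series."""
    return sum(series) / len(series)


def run_job(job):
    """Run one job and attach a provenance record describing how it was produced."""
    result = mean_signal(job["series"])
    spec = {"query": job["query"], "analysis": "mean_signal"}
    # A stable hash of the job specification identifies the analysis.
    digest = hashlib.sha256(json.dumps(spec, sort_keys=True).encode()).hexdigest()
    return {"result": result, "provenance": {**spec, "spec_sha256": digest}}


def run_all(jobs, max_workers=4):
    """Dispatch jobs concurrently; results come back in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, jobs))


if __name__ == "__main__":
    # Each job pairs a (hypothetical) query with the data it would return.
    jobs = [
        {"query": "SELECT signal FROM timeseries WHERE voxel = 1", "series": [1.0, 2.0, 3.0]},
        {"query": "SELECT signal FROM timeseries WHERE voxel = 2", "series": [4.0, 6.0]},
    ]
    for record in run_all(jobs):
        print(record["result"], record["provenance"]["spec_sha256"][:12])
```

Keeping the query text and analysis name alongside each result means any output can later be traced back to the exact job that produced it.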

For neuroscientists, the information that results from such analyses is often most meaningful when displayed graphically in the form of brain activation maps. In collaboration with the AFNI development group, we have recently established mechanisms for plotting the results of SQL queries using the AFNI viewer. We will show how this interface allows researchers to visualize highly specific subsets of data without needing to retrieve the entire dataset.

By way of a concrete example, suppose a scientist wishes to establish the impact of a certain filtering parameter on the results of a neuroimaging time-series analysis. The scientist might set up a parameter sweep, in which the effects of various filter combinations are assessed. Suppose there are 50 filter combinations, and each job (defined by one filter specification) takes 2 hours to process; run serially, the sweep would take 100 hours. Using SWIFT, each job is submitted to a computing node located at a GRID site. The nodes retrieve data from a database via SQL queries, process them using the assigned filtering parameters, plot the results on a 2D cortical representation, and save the result as an image. The resulting images are then returned to the researcher via SWIFT mechanisms.
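The arithmetic behind this example can be made explicit with a short sketch. The 50 combinations and the 2-hour job time come from the example above. The decomposition into 10 x 5 filter settings and the node counts are assumptions, and the estimate is idealized: it ignores queueing, data transfer, and scheduling overhead.

```python
import itertools
import math

JOB_HOURS = 2  # per-job processing time quoted in the example


def filter_grid(low_cutoffs, high_cutoffs):
    """Enumerate every (low, high) filter combination in the sweep."""
    return list(itertools.product(low_cutoffs, high_cutoffs))


def wall_clock_hours(n_jobs, n_nodes, job_hours=JOB_HOURS):
    """Idealized wall-clock time when jobs run in waves of n_nodes."""
    return math.ceil(n_jobs / n_nodes) * job_hours


if __name__ == "__main__":
    # Hypothetical grid of 10 x 5 settings, giving the 50 combinations.
    grid = filter_grid(range(10), range(5))
    for n_nodes in (1, 8, 50):
        print(n_nodes, "node(s):", wall_clock_hours(len(grid), n_nodes), "hours")
```

Under these assumptions the sweep takes 100 hours on one node, 14 hours on 8 nodes, and 2 hours when every job gets its own node.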

References

1. Hasson et al. (2008). NeuroImage, 39, 693-706

2. Zhao et al. (2007). IEEE International Workshop on Scientific Workflows

Conference: Neuroinformatics 2008, Stockholm, Sweden, 7 Sep - 9 Sep, 2008.

Presentation Type: Oral Presentation

Topic: Live Demonstrations

Citation: Hasson U, Andric M, Kenny S, Wilde M and Small S (2008). An architecture for GRID-based analysis of Neuroimaging data using relational databases and the SWIFT workflow engine. Front. Neuroinform. Conference Abstract: Neuroinformatics 2008. doi: 10.3389/conf.neuro.11.2008.01.118

Received: 25 Jul 2008; Published Online: 25 Jul 2008.

* Correspondence: Uri Hasson, The University of Chicago, Chicago, United States, uhasson@uchicago.edu
