Helmholtz: a modular tool for neuroscience databases
Storing experimental neuroscience data in a database, together with the annotations and metadata needed to understand it, promises major payoffs both for the scientists who generate the data and for the progress of neuroscience in general. At present, however, systematically putting annotations and other metadata into digital form is generally arduous, and the benefits are difficult to realize. The cost/benefit ratio for the experimentalist is therefore poor, with the corollary that the flow of data shared with the broader community is more a trickle than a flood.
To improve this situation, we are developing tools that aim to make it easier for scientists to put their data into a database, to annotate it appropriately, to obtain some immediate benefit from the effort, and to share the data with others, under terms they control, if they so wish.
The tension between the immediate benefit of giving scientists a better tool to manage their own data and the longer-term goal of sharing with others has several implications: 1. the metadata stored should be customizable, since the needs of different labs can vary widely, but there should also be a common core to ensure interoperability if the data are published; 2. we should support storing both raw data and processed/analyzed data, so that scientists can manage all phases of their workflow and so that the provenance of an individual result or graph can be easily tracked; 3. the same tool should be usable both as a local database and as a public resource; 4. both data and metadata should have fine-grained and easy-to-use access controls.
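As an illustration of points 1 and 4, the following is a minimal sketch in plain Python, not taken from the Helmholtz code base. The class `ExperimentRecord`, its field names and the user names are all hypothetical; the sketch only shows how a fixed common core can sit alongside free-form lab-specific metadata, with a simple per-record access check.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentRecord:
    """Hypothetical record: a common core plus lab-specific extras."""

    # Common core: the same fields for every lab, to keep records interoperable.
    owner: str
    species: str
    recording_type: str
    # Lab-specific metadata: free-form, so each lab can add what it needs.
    lab_metadata: dict = field(default_factory=dict)
    # Access control: the owner decides who else may read the record.
    shared_with: set = field(default_factory=set)
    public: bool = False

    def core_metadata(self) -> dict:
        """Return only the fields that every lab is expected to provide."""
        return {
            "owner": self.owner,
            "species": self.species,
            "recording_type": self.recording_type,
        }

    def can_read(self, user: str) -> bool:
        """A user may read a record if it is public, theirs, or shared with them."""
        return self.public or user == self.owner or user in self.shared_with


record = ExperimentRecord(
    owner="alice",
    species="cat",
    recording_type="intracellular",
    lab_metadata={"rig": "setup-2"},  # hypothetical lab-specific annotation
)
record.shared_with.add("bob")
```

Here `record.can_read("bob")` is true while `record.can_read("carol")` is false until the owner shares the record or makes it public.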
We are developing an open-source tool, Helmholtz (named after the 19th century physicist and physiologist), implemented mainly as a series of "apps" (applications/components) built with the Django web framework (http://www.djangoproject.com/). The advantages of using a web framework are: 1. it is easy to set up either a local database (Django comes with a simple built-in web server) or a centralised repository; 2. a highly modular structure makes the database easy to customize and extend; 3. the underlying database layer is abstracted, so that (i) any supported relational database can be used (e.g. MySQL, PostgreSQL, Oracle or the built-in SQLite) and (ii) knowledge of SQL is not required, making it easier for non-database specialists to develop tools and extensions; 4. it is easy to develop multiple interfaces, e.g. a web interface, a web-services interface, or interfaces to desktop acquisition or analysis software.
Helmholtz provides core components which handle things that are common to all or many domains of neuroscience: 1. data acquisition: metadata for experimental setups (equipment, etc.), subjects (species, weight, anaesthesia, surgery, etc.), and stimulation and recording protocols, for electrophysiology (in vivo and in vitro), optical imaging and morphological reconstructions; 2. databasing of analysis results, linked to the original data on which they are based, and with descriptions of the analysis methods used; 3. access control.
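The link between analysis results and the original data (point 2) can be sketched as follows. This is a plain-Python illustration under assumed names, not the Helmholtz data model: the class `DataItem`, the function `raw_sources` and the file names are hypothetical. Each item records the method that produced it and the items it was derived from, so the raw data behind any result can be recovered by walking the links.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataItem:
    """Hypothetical data item: raw data has no inputs, results have some."""

    name: str
    method: str = ""  # description of the analysis; empty for raw data
    inputs: List["DataItem"] = field(default_factory=list)


def raw_sources(item: DataItem) -> List[str]:
    """Walk the provenance links back to the raw data a result depends on."""
    if not item.inputs:
        return [item.name]
    found: List[str] = []
    for parent in item.inputs:
        for name in raw_sources(parent):
            if name not in found:  # keep each raw source only once
                found.append(name)
    return found


# Hypothetical two-step analysis: raw traces -> spike times -> histogram.
raw_a = DataItem("cell01_trial1.dat")
raw_b = DataItem("cell01_trial2.dat")
spikes_a = DataItem("spikes_trial1", "threshold detection", [raw_a])
spikes_b = DataItem("spikes_trial2", "threshold detection", [raw_b])
psth = DataItem("psth_cell01", "peristimulus time histogram", [spikes_a, spikes_b])

print(raw_sources(psth))  # ['cell01_trial1.dat', 'cell01_trial2.dat']
```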
Extension components for different sub-domains of neuroscience will gradually be developed in collaboration with experts in those domains (we currently provide only a visual system component, with functionality for describing and visualising visual stimuli, etc.). It should be straightforward for anyone with some programming experience to develop their own extension components.
The Helmholtz components could be combined with pre-existing Django components (e.g., user management, syndication using RSS/Atom, discussion forums, social-networking tools, a wiki, etc.) to develop a full-scale repository or portal.
The Helmholtz system has so far been used to develop a database of functional and structural data from visual cortex (https://www.dbunic.cnrs-gif.fr/visiondb/) within the EU FET integrated project FACETS (http://www.facets-project.org).
Neuroinformatics 2010, Kobe, Japan, 30 Aug - 1 Sep, 2010.
10 Jun 2010.
Andrew Davison, CNRS, UNIC, Gif sur Yvette, France, firstname.lastname@example.org