Event Abstract

HID-Genetics: A Federated BIRN-enabled Data Management System for Clinical, Imaging, and Genome-Wide Association Studies

  • 1 University of California, Irvine, United States

In the past 20 years, enormous strides have been made in imaging the human brain’s structure and function. Comparable progress has been made in understanding the human genome and its role in disease. Complex behavioral and neurodevelopmental or neurodegenerative diseases such as schizophrenia and Alzheimer’s disease appear to involve the combined effects of multiple genes and important interactions with the external and internal environment. Given the known importance of both genetics and environment in brain function, and the role of neuroimaging in revealing brain dysfunction, the capability to integrate genetics with brain imaging data in a single data resource is needed. Currently, there are no open-source data management systems that support federated storage and retrieval of neuroimaging, clinical, and genetics data. The HID-Genetics component of the Human Imaging Database (HID; Ozyurt 2010) bridges the gap between the support for federated neuroimaging and clinical data already included in the HID and the genetics data and annotation support needed for today’s imaging-genetics association studies.

The HID is an open-source, extensible database schema and associated three-tier J2EE application environment for the storage and retrieval of biomedical data, designed to operate in a federated database environment. The database contains an extensible framework for the definition and storage of clinical assessment and demographic data. The HID environment also contains (1) an intuitive web-based user interface for entering and managing subject data, a core component of which is the management of behavioral and/or clinical data through modules that streamline the development of online forms for entering and maintaining large numbers of measures; and (2) a data integration engine, built on top of the BIRN data integration environment, that allows multiple sites running the HID to form a federated database that can be queried as a single database resource from the web-based user interface.

The HID-Genetics extensions have been designed to enable storage of genotype data in a federated environment and to integrate the genetics data with the extensive clinical and imaging data collected on the same individuals. The core HID implementation has been augmented with additional tables for storing Single Nucleotide Polymorphism (SNP) RS number, strand information, chromosome, GC score, allele, and additional metrics and metadata from the genotyping platform. The system supports simple summary statistics to help check the quality of the genetics data. Additional genetics annotation support includes the ability to import annotation data from the genotyping platform and from online sources such as the UCSC Genome Browser (genome.ucsc.edu; Fujita 2011). We are also developing capabilities for integrating such external annotation data through two modalities: (i) direct (remote) database access and (ii) local materialization. Modeled after the SNPLims data management system for genome-wide association studies (Orro 2008), the HID-Genetics component allows import of multiple common genetics data formats and export of data to files in the formats expected by analysis tools such as Plink and EIGENSTRAT.
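
As a rough illustration of the quality-check and export steps described above, the following Python sketch computes a per-SNP call rate (one simple summary statistic) and writes genotypes as text in the Plink MAP/PED layout. It is not the HID-Genetics implementation: the `Snp` class, the in-memory `Genotypes` mapping, the function names, and the sample identifiers are all hypothetical, and sex and phenotype are written as Plink's "unknown" codes because this example carries no such data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; not the actual HID-Genetics table layout."""
    rs_id: str        # RS number, e.g. "rs0001" (made-up identifier)
    chromosome: str
    position: int     # base-pair position


# genotypes[subject_id][rs_id] -> allele pair, or None when the call is missing
Genotypes = Dict[str, Dict[str, Optional[Tuple[str, str]]]]


def call_rate(snp: Snp, genotypes: Genotypes) -> float:
    """Fraction of subjects with a non-missing genotype call for this SNP."""
    calls = [per_subject.get(snp.rs_id) for per_subject in genotypes.values()]
    if not calls:
        return 0.0
    return sum(call is not None for call in calls) / len(calls)


def to_map_lines(snps: List[Snp]) -> List[str]:
    """Plink MAP rows: chromosome, SNP id, genetic distance (0 = unknown), position."""
    return [f"{s.chromosome}\t{s.rs_id}\t0\t{s.position}" for s in snps]


def to_ped_lines(snps: List[Snp], genotypes: Genotypes) -> List[str]:
    """Plink PED rows: six leading columns, then two alleles per SNP.

    Family id is set to the subject id, parents and sex to 0 (unknown),
    phenotype to -9 (missing), and missing genotypes to "0 0".
    """
    lines = []
    for subject_id in sorted(genotypes):
        fields = [subject_id, subject_id, "0", "0", "0", "-9"]
        for s in snps:
            call = genotypes[subject_id].get(s.rs_id)
            fields.extend(call if call is not None else ("0", "0"))
        lines.append(" ".join(fields))
    return lines


if __name__ == "__main__":
    # Made-up data for two subjects and two SNPs.
    snps = [Snp("rs0001", "1", 1000), Snp("rs0002", "2", 2000)]
    genotypes: Genotypes = {
        "S1": {"rs0001": ("A", "G"), "rs0002": None},
        "S2": {"rs0001": ("A", "A"), "rs0002": ("C", "T")},
    }
    for s in snps:
        print(s.rs_id, "call rate:", call_rate(s, genotypes))
    print("\n".join(to_map_lines(snps)))
    print("\n".join(to_ped_lines(snps, genotypes)))
```

In a federated deployment the genotype mapping would presumably be filled from database queries rather than held in memory; the sketch only shows the shape of the summary statistic and of the exported text.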
The existing HID query interface has been redesigned to support genetics-based queries and filtering of human subject imaging data by genotype, and to give the user an intuitive interface for constructing queries across these different data types.

The system is being tested and evaluated using the Function Biomedical Informatics Research Network (FBIRN) test-bed. The FBIRN consortium is actively collecting clinical, imaging, and genetics data across ten geographically distributed sites, which provides a unique environment in which to test and evaluate the performance and design of the HID-Genetics components.

This work was supported in part by the NIH through the following NCRR grant: the Biomedical Informatics Research Network (1 U24 RR025736-01).

Ozyurt IB, Keator D, Wei D, Fennema-Notestine C, Pease K, Bockholt B, Grethe J. Federated Web-accessible Clinical Data Management within an Extensible NeuroImaging Database. Neuroinformatics. 2010;23(1):98-106.

Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, Goldman M, Barber GP, Clawson H, Coelho A, Diekhans M, Dreszer TR, Giardine BM, Harte RA, Hillman-Jackson J, Hsu F, Kirkup V, Kuhn RM, Learned K, Li CH, Meyer LR, Pohl A, Raney BJ, Rosenbloom KR, Smith KE, Haussler D, Kent WJ. The UCSC Genome Browser database: update 2011. Nucleic Acids Res. 2011 Jan;39(Database issue):D876-82. Epub 2010 Oct 18.

Orro A, Guffanti G, Salvi E, Macciardi F, Milanesi L. SNPLims: a data management system for genome wide association studies. BMC Bioinformatics. 2008;9 Suppl 2:S13.

Keywords: Genomics and genetics, Infrastructural and portal services

Conference: 4th INCF Congress of Neuroinformatics, Boston, United States, 4 Sep - 6 Sep, 2011.

Presentation Type: Demo Presentation

Topic: Genomics and genetics

Citation: Keator D, Chen J, Ashish N, Torri F, Lakatos A, Potkin S, Macciardi F and Wei D (2011). HID-Genetics: A Federated BIRN-enabled Data Management System for Clinical, Imaging, and Genome-Wide Association Studies. Front. Neuroinform. Conference Abstract: 4th INCF Congress of Neuroinformatics. doi: 10.3389/conf.fninf.2011.08.00123

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 17 Oct 2011; Published Online: 19 Oct 2011.

* Correspondence: Dr. David Keator, University of California, Irvine, Irvine, United States, dbkeator@uci.edu