Application of Neuroanatomical Ontologies for Neuroimaging Data Annotation

The annotation of functional neuroimaging results for data sharing and re-use is particularly challenging, due to the diversity of terminologies of neuroanatomical structures and cortical parcellation schemes. To address this challenge, we extended the Foundational Model of Anatomy Ontology (FMA) to include cytoarchitectural Brodmann area labels and a morphological cortical labeling scheme (e.g., the part of Brodmann area 6 in the left precentral gyrus). This representation was also used to augment the neuroanatomical axis of RadLex, the ontology for clinical imaging. The resulting neuroanatomical ontology contains explicit relationships indicating which brain regions are "part of" which other regions, across cytoarchitectural and morphological labeling schemas. We annotated a large functional neuroimaging dataset with terms from the ontology and applied a reasoning engine to analyze this dataset in conjunction with the ontology. This yielded successful inferences from the most specific level (e.g., how many subjects showed activation in a subpart of the middle frontal gyrus) to more general levels (e.g., how many activations were found in areas connected via a known white matter tract). In summary, we have produced a neuroanatomical ontology that harmonizes several different terminologies of neuroanatomical structures and cortical parcellation schemes. This neuroanatomical ontology is publicly available as a view of FMA at the Bioportal website. The ontological encoding of anatomic knowledge can be exploited by computer reasoning engines to make inferences about neuroanatomical relationships described in imaging datasets using different terminologies. This approach could ultimately enable knowledge discovery from large, distributed fMRI studies or medical record mining.

Introduction
The use of functional magnetic resonance imaging in the study of both healthy and diseased brain function has resulted in a massive amount of both published literature and raw data. The National Institutes of Health has recognized that data can be re-used and re-analyzed, and has mandated that studies with funding levels over $500,000 share their data with the research public. The functional neuroimaging community has been aware for some time of the value of sharing data for re-use, as implemented first in the fMRI Data Center (Van Horn and Ishai, 2007) and more recently in the BIRN Data Repository (Keator et al., 2008), as well as international projects such as PsyGrid (Ainsworth et al., 2006), Imagen (Thyreau et al., 2009), and BaxGrid (Nakai et al., 2008). Simultaneously, the interest in pooling information from electronic health records for data mining and re-analysis has grown, with the goal of discovering the subtle patterns that can only be identified in larger datasets. What looks like irrelevant individual variation in a single patient's dataset may be a reliable diagnostic or prognostic characteristic of a clinical subset identified in a larger, national, or international sample.
Integrating results across multiple neuroimaging studies would be enhanced, at least in part, if findings in images were annotated using a consistent nomenclature. Meta-analyses of published loci of fMRI activation can be more powerful when the activation loci included in the meta-analysis are accompanied by anatomical labels (as reviewed by Costafreda, 2009). Many, though not all, neuroimaging datasets are currently annotated using one of several available labeling systems, such as the Talairach Atlas (Talairach and Tournoux, 1988), the Montreal Neurological Institute (MNI) atlas (as implemented in Tzourio-Mazoyer et al., 2002), or other atlases available in software such as FreeSurfer (Fischl et al., 2004; Desikan et al., 2006). Each of these labeling systems provides a neuroanatomical label, drawn from its particular nomenclature, for an arbitrary 3D location in its particular standardized brain space. The ability to map these labels into the same coordinate system and visualize where they overlap, as in the SumsDB system (Van Essen and Dierker, 2007), highlights the disagreements both in nomenclature and in localization across these atlases. Nomenclature disagreements might include, for example, whether the concept of the dorsolateral prefrontal cortex (DLPFC) should be included as a label, or whether the anterior and posterior cingulate gyrus can be included as separate terms. Localization disagreements include, for example, whether the same 3D coordinate would be identified as the precentral gyrus in two different atlases (as quantified for a number of atlases in Bohland et al., 2009); these will occur even if the nomenclature is standardized. Both nomenclature and localization methods need to be standardized to enhance neuroimaging data sharing and re-use.
The problem of a universal coordinate system for neuroimaging localization is still being addressed, for example, to ensure that neuroimaging results across different studies are being coregistered to the same space in comparable and consistent ways, given differences in brain shape among human subjects and templates (Derrfuss and Mar, 2009).
The issues of nomenclature and proper labeling are not the only concerns in re-use of neuroimaging data and results. The wealth of knowledge regarding relationships among anatomical structures is not captured simply in a nomenclature, or even in a standardized hierarchy of which structures comprise or are "part of" other structures. Knowledge of which areas project to others, which structures develop from others, or which regions are partially overlapping can be represented in a formal ontology. An ontology, as we use the term here, is a formal representation of entities and the relationships between these entities used in a domain of knowledge. The advantage of ontologies is that they make explicit the relationships among the entities being represented as types in a given domain. The formal representation of knowledge about a domain allows potential use by automated information retrieval and reasoning systems. We address the problem of a standardized nomenclature, the neuroanatomical terms, and their interrelationships through the expansion of the neuroanatomical module within the Foundational Model of Anatomy (Martin et al., 2003; Rosse and Mejino, 2003; Golbreich et al., 2006). We have integrated neuroanatomical terminologies based on multiple schemata into a single ontology, annotated neuroimaging results with ontological terms at the finest granularity available, and used that information to automate comparison of fMRI results. We used the Foundational Model of Anatomy Ontology (FMA) as the framework for integrating the different neuroimaging terms, then extracted these neuroanatomy terms from the FMA and incorporated them into RadLex (Langlotz, 2006), which is the primary application ontology for the domain of radiology.
The FMA is a reference ontology initially created for the domain of human anatomy, but which now can be extended to other species (Rosse and Mejino, 2003, 2007). FMA is designed to represent the domain of anatomy; the scope of the neuroanatomical module is the structures of the brain and their relationships. The backbone of the FMA is a taxonomy of is_a relationships which can be viewed as a graph, in which each anatomical entity is represented as a type or class that is assigned to a single parent class (following the principle of single inheritance; see, e.g., Larson and Martone, 2009). The dominant anatomical entity in the FMA is the term anatomical structure, which represents an entity or a type in terms of structural properties; these properties are its inherent 3D shape and its parts, which are connected and arranged in a spatial pattern generated through the coordinated expression of the organism's set of genes (Rosse and Mejino, 2003, 2007). The FMA represents structures at all levels of granularity, from the entire body, to organs and organ systems, to portions of tissue, down to the cells and subcellular structures. The neuroanatomical module within the FMA encompasses neuroanatomical structures such as the telencephalon of the forebrain, and its parts such as hemispheres, lobes, and gyri. Allowable relations include is_a, has_shape, has_part, contained_in, and connected_to, to name a few (a full list is provided in Rosse and Mejino, 2007). (In what follows, types within the ontology are set in a distinct font, while relationships are denoted in italics.) The FMA identifies different methods for bounding regions and subregions. "Components" are parts which have predominantly bona fide boundaries, for example, lymph node, pituitary gland, etc. Parts that have fiat or arbitrary boundaries are called "regions", which in turn are distinguished on the basis of the presence or absence of anatomical landmarks.
Regions whose boundaries rely predominantly on landmarks (fixed fiat boundary) are called "segments", while regions that are not based on any landmarks but rather on highly arbitrary case-by-case decisions (floating fiat boundary) are called "zones". The neuroanatomical domain knowledge that the left precentral gyrus and the right precentral gyrus are both precentral gyri within the brain, for example, is represented by the types Left precentral gyrus and Right precentral gyrus standing in an is_a relationship to Precentral gyrus, which in turn is_a Gyrus, which is_a Segment of cerebral hemisphere (see Figure 1). Note that this visualization does not represent other relationships such as regional_part_of, connects_to or is_adjacent_to, which are also available within FMA to represent neuroanatomical knowledge. The FMA is both broader and more fine-grained than extant anatomy texts or terminologies (e.g., it has as its high-level nodes such types as Material anatomical entity and Immaterial anatomical entity, and it includes types such as Gray matter of right precentral gyrus, which are not found in other ontologies). The most recent public release of FMA is available through the National Center for Biomedical Ontology (NCBO) Bioportal.
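The is_a chain described above can be sketched as a single-inheritance lookup. This is a minimal illustration, not the FMA's actual storage format: the dictionary `IS_A` and the helper names `ancestors` and `is_subtype_of` are assumptions introduced here, and only the four type names come from the text.

```python
# Hypothetical fragment of the is_a taxonomy described in the text.
# Each type has exactly one parent (single inheritance).
IS_A = {
    "Left precentral gyrus": "Precentral gyrus",
    "Right precentral gyrus": "Precentral gyrus",
    "Precentral gyrus": "Gyrus",
    "Gyrus": "Segment of cerebral hemisphere",
}


def ancestors(term, is_a=IS_A):
    """Return the supertypes of `term`, from its direct parent upward."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain


def is_subtype_of(term, candidate, is_a=IS_A):
    """True if `candidate` appears anywhere above `term` in the taxonomy."""
    return candidate in ancestors(term, is_a)


if __name__ == "__main__":
    print(ancestors("Left precentral gyrus"))
```

Because every type has one parent, the upward walk is a simple loop; a multiple-inheritance ontology would need a graph traversal instead.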
RadLex, on the other hand, is an ontology developed by the Radiological Society of North America which is designed to represent the standard terms used in radiological imaging procedures, observations and diagnostic reports (Langlotz, 2006). For example, a report would include report components, which might include the patient identifier, the imaging technique, the observations, and include values regarding the imaging quality. RadLex is a community effort and has a formal subcommittee process for vetting changes to the ontology. It is also available for visualization and download on the NCBO Bioportal. Prior to the work reported here, neither the FMA nor RadLex incorporated the various terminologies used for annotating and labeling specifically neuroimaging data, particularly the ability to annotate using both cytoarchitectural and morphological labeling nomenclature.
The Talairach Daemon (TD) (Lancaster et al., 2000) is publicly available software which has, for every 3D location in the Talairach and Tournoux Atlas (Talairach and Tournoux, 1988), already identified a hierarchy of labels which represents an increasing granularity from the hemisphere, to the lobe, to the gyrus and finally a mapping to the Brodmann area (Brodmann, 1909/1994). Thus the Daemon can apply both morphological and cytoarchitecturally defined labels; for an arbitrary 3D location, it can identify that it is in the right cerebral hemisphere, the frontal lobe, the middle frontal gyrus and in Brodmann area 6. On the other hand, the TD is a labeling atlas, not an ontology. It lacks the explicit rich anatomical relationships of the FMA; the anatomic partonomy implicit in this hierarchy of terms has not been explicitly expressed and correlated with the other terminologies and ontologies such as the FMA and RadLex. Automated reasoning involving various relations such as is_a, part_of and projects_to is enabled by incorporation into a formal ontology.
The goal of this work was to incorporate terms from the cross-product of morphological and cytoarchitectural labeling schemes included in the TD for the first time into the FMA and RadLex, and to enable querying the results of neuroimaging studies using the disparate terminological schemes. Another goal of this work was to demonstrate that by incorporating these terms into the ontology, the summary analysis of imaging dataset results annotated with these terms can be automated using computer reasoning methods. This opens the way for automated machine searching and pattern finding in functional imaging datasets and could enable larger analyses than those currently performed by the unassisted researcher.

Materials and Methods
Enhancing neuroanatomical content in the FMA
We have incorporated into the FMA the anatomical entities referenced by the neuroimaging terminologies used in the following major neuroscientific projects: Talairach as listed in the TD (Lancaster et al., 2000), FreeSurfer (Desikan et al., 2006), Anatomical Automatic Labeling (AAL) (Tzourio-Mazoyer et al., 2002), and NIFSTD/NeuroLex (Bug et al., 2008a,b). The full details of this work are being presented in another paper; in this particular study, we focus on how we reconciled the TD terms with the formal representation of corresponding neuroanatomical structures in the FMA. There are 1105 terms in the TD list of labels. We mapped these terms individually to the FMA and encoded the regional part of relationships between them. Many of the gyral, lobar, and hemispheric terms were already present in FMA. However, the simultaneous labeling using Brodmann nomenclature required new terms. The set of terms encoded in the TD label implicitly refers to neuroanatomical entities at varying levels of granularity. An example of the levels of labeling included in the TD output and its correlation with the FMA hierarchy is shown in Figure 2B. The TD output indicating the part of Brodmann area 6 which is in the precentral gyrus of the frontal lobe of the right hemisphere is Right Cerebrum.Frontal Lobe.PreCentral Gyrus.Gray Matter.Brodmann Area 6. This labeling intuitively codes complex neuroanatomical knowledge that there is a part of Brodmann Area 6 in the right hemisphere which is part of the precentral gyrus of the right hemisphere, which is part of the frontal lobe, which is part of the right cerebral hemisphere.
The Foundational Model of Anatomy Ontology already represented the knowledge that the right precentral gyrus was part of the frontal lobe in the right cerebral hemisphere. To represent the TD output fully in FMA required the addition of the Brodmann area as a type Brodmann area which is_a Region of cerebral cortex; each Brodmann area type had both a right and left subtype created. To represent the part of Brodmann area 6 that is in the precentral gyrus in the is_a taxonomy, we created the type Segment of a Brodmann Area, which also is_a Region of cerebral cortex. The subclasses representing parts of the Brodmann areas can then be included, for example, Segment of Brodmann Area 6. A type that is a Segment of a Brodmann Area is a regional part of some Brodmann Area. It could also be a regional part of many other cortical regions, e.g., the frontal lobe or a particular gyrus. Thus, the Segment of Brodmann Area 6 would have a subclass such as Brodmann Area 6 of the right precentral gyrus, which is a regional part of both Right Brodmann Area 6 and the Gray matter of the right precentral gyrus (see Figures 2 and 4).
task twice during the scanning session and both runs of the task were analyzed together as detailed below. The final dataset included scans from 113 subjects with and 112 without schizophrenia, from eight different universities.

Analysis and labeling technique
Data analysis was carried out using the FBIRN Image Processing Stream (FIPS), an imaging analysis tool for multi-site fMRI analysis based on FSL. Images were motion corrected using MCFLIRT (Jenkinson and Smith, 2001); slice-timing correction was applied using Fourier-space time-series phase-shifting. Then the images were skull stripped using the BET tool (Smith, 2002). The extracted brain images were smoothed spatially using a Gaussian kernel of 8 mm FWHM. Time-series statistical analysis contrasting the active (checkerboard flashing, button pressing) conditions with the fixation condition was carried out for each run using a general linear model along with the hemodynamically corrected reference paradigm. Each result was registered and normalized to a standard template from the MNI using a 12 degree of freedom affine transformation. For each subject a cross-run analysis was carried out using a standard weighted fixed effects model. Clusters of significant results for each subject were identified using a voxel threshold of z = 3 and a cluster-wise significance level of 0.05. The location of the maximal voxel in each significant cluster in MNI space was transformed into Talairach space (using the Matthew Brett mni2tal.m transformation). The TD then was used to label each point in Talairach space with a standard neuroanatomical label, finding the closest gray matter (as recommended by Bohland et al., 2009). Voxels whose closest gray matter label was more than 40 mm away were considered unlabeled. The resulting dataset comprised 1740 labeled locations from the 224 subjects. An example of a single subject's results and their labeled clusters is included in Figure 3.
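The two localization steps above (MNI to Talairach conversion, then nearest gray matter labeling with a distance cutoff) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the coefficients in `mni2tal` are the commonly quoted approximation of the Brett transform and should be checked against the mni2tal.m script itself, and `nearest_label` is a hypothetical brute-force stand-in for the Talairach Daemon's own search.

```python
import math


def mni2tal(x, y, z):
    """Approximate MNI -> Talairach conversion, piecewise affine around z = 0.

    Coefficients are the commonly quoted approximation of the Brett
    transform (an assumption here; verify against mni2tal.m).
    """
    tx = 0.9900 * x
    if z >= 0:
        ty = 0.9688 * y + 0.0460 * z
        tz = -0.0485 * y + 0.9189 * z
    else:
        ty = 0.9688 * y + 0.0420 * z
        tz = -0.0485 * y + 0.8390 * z
    return (tx, ty, tz)


def nearest_label(point, labeled_points, max_dist_mm=40.0):
    """Return the label of the closest labeled point, or None when the
    closest one lies farther away than `max_dist_mm`."""
    best_label, best_dist = None, float("inf")
    for coords, label in labeled_points:
        d = math.dist(point, coords)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= max_dist_mm else None


if __name__ == "__main__":
    # Hypothetical two-point "atlas", for illustration only.
    atlas = [((0.0, 0.0, 0.0), "A"), ((100.0, 0.0, 0.0), "B")]
    print(nearest_label(mni2tal(2.0, 0.0, 0.0), atlas))
```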

Ontological application
Each labeled cluster was assigned the FMA ID, from the expanded FMA module, which matched the TD label assigned to it. This was done manually for this example, though it has since been automated. For example, a point labeled LeftCerebrum.FrontalLobe.MiddleFrontalGyrus.GrayMatter.BrodmannArea9 from the TD output was assigned the FMA ID 271653, which has the name Brodmann area 9 of the left middle frontal gyrus. The 1740 labeled clusters are similar in form to how results are often published in the literature, such as the findings that are in the BrainMap database (Fox and Lancaster, 2002; Laird et al., 2005). We used the data annotated by the ontology to perform these analyses:

Analysis 1. Which parts of BA 6 are activated in schizophrenics (SZ) and healthy volunteers (HV)?
Do schizophrenic subjects show a different distribution of locations of activation in Brodmann area 6 than do controls? Both schizophrenics and controls show activation in left hemisphere Brodmann area 6 (BA 6), for example, during this sensorimotor task; however,
With this basic framework added, incorporating the TD labeling scheme into FMA required incorporating terms for the parts of Brodmann area labels that are in various gyri, and conversely, the parts of various gyri that overlap with various Brodmann areas. In a more comprehensive report we describe similar approaches for mapping FMA entities to FreeSurfer, AAL, and NeuroLex.

Extracting a NeuroFMA view and using the view to enhance RadLex
After the appropriate types were added to FMA as above, we extracted a "view" of the FMA that was confined to neuroanatomy ("NeuroFMA", Shaw et al., 2008). This view was then used to reorganize the neuroanatomy component of RadLex; that is, the appropriate FMA IDs were copied into attributes for the corresponding RadLex concepts, and the mappings to alternative anatomical parcellations were copied as well (FMA-RadLex, Mejino et al., 2008). Both the NeuroFMA and FMA-RadLex were converted to OWL Full (Web Ontology Language, W3C-OWL-Working-Group, 2009) following methods described previously (Noy and Rubin, 2008), and both are available as OWL views of the relevant parent ontologies in the NCBO Bioportal (Noy et al., 2009).

Ontology web service for "intelligent queries"
The availability of NeuroFMA and FMA-RadLex in OWL allowed them to be incorporated in an ontology web service developed by the University of Washington Structural Informatics Group (SIG). This service accepts queries in vSparQL (Shaw et al., 2008), an enhanced version of SparQL (SparQL Protocol and RDF Query Language). vSparQL enhances SparQL by (among other things) allowing a single "intelligent" query to follow complex pathways in the graph defined by the ontology, such as finding all subclasses of a concept, and for each of these, following the transitive closure of its parts (i.e., parts of parts). We exploit this capability and the enhanced content in the NeuroFMA and FMA-RadLex in the competency queries below.
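The kind of traversal described above (all subclasses of a concept, then the transitive closure of parts for each) can be sketched in plain Python. This is not vSparQL and does not call the SIG web service; the dictionaries `SUBCLASS_OF` and `HAS_PART` are a hypothetical ontology fragment, and the entry marked "(assumed)" is invented purely to show a part of a part.

```python
# Hypothetical ontology fragment; entries marked "(assumed)" are illustrative.
SUBCLASS_OF = {  # parent class -> direct subclasses
    "Brodmann area 6": ["Left Brodmann area 6", "Right Brodmann area 6"],
}
HAS_PART = {  # whole -> direct regional parts
    "Right Brodmann area 6": ["Brodmann area 6 of right precentral gyrus"],
    "Brodmann area 6 of right precentral gyrus": ["Dorsal zone (assumed)"],
}


def transitive_closure(start, edges):
    """All nodes reachable from `start` by following `edges` one or more times."""
    seen = set()
    stack = list(edges.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, []))
    return seen


def parts_of_all_subclasses(concept):
    """For the concept and each of its subclasses, collect parts of parts."""
    classes = {concept} | transitive_closure(concept, SUBCLASS_OF)
    return {cls: transitive_closure(cls, HAS_PART) for cls in sorted(classes)}


if __name__ == "__main__":
    for cls, parts in parts_of_all_subclasses("Brodmann area 6").items():
        print(cls, "->", sorted(parts))
```

A query language with path expressions lets the server perform this traversal in one request; the sketch only shows what the result set contains.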

Neuroimaging data
The imaging data used to demonstrate the utility of our ontologies for data analysis were part of the multi-site dataset collected by the Functional Imaging Biomedical Informatics Research Network (FBIRN; Potkin et al., 2009). Eight different universities participated in the recruitment and scanning of subjects with long-term schizophrenia and age- and gender-matched healthy subjects, using a fixed protocol and cognitive experiments. Those data have been analyzed and made public and a number of the results have been published (Kim et al., 2009a,b; Potkin et al., 2009). Scanning was performed on a combination of 1.5 T and 3 T Siemens and GE scanners; scanning parameter details can be found in Potkin et al. (2009). Parts of the dataset used in this example have been previously presented (Lee et al., 2008). In short, the task performed in the MRI scanner consisted of alternating 16-s epochs of rest (with a white cross for fixation) and an irregularly flashing black and white circular checkerboard, with subjects pressing their index finger whenever the checkerboard flashed. Subjects performed the
June 2010 | Volume 4 | Article 10 | 6 Turner et al.
Neuroanatomical ontologies in neuroimaging

Analysis 2: Which parts of the precentral gyrus are active in SZ and HV in these data?
This question reverses the relationships from Analysis 1, focusing on the morphological labels rather than the Brodmann areas. While the precentral gyrus is largely identified with motor control, it includes parts of BA 3, 4, 6, 9, 43 and 44, which can play very different roles in brain function. The query for this analysis must retrieve from FMA segments of Brodmann areas which are part_of the precentral gyrus.

Analysis 3: Do the SZ subjects show more or less co-activation than healthy subjects do, in cortical regions that are connected by the Superior Longitudinal Fasciculus I (SLF1)?
This analysis draws from information not available within the TD output itself and could not be performed by knowledgeably parsing the TD output. It combines the annotated data with the knowledge
Brodmann area 6 spans medial frontal gyrus, precentral gyrus, and superior frontal gyrus. We would like to know whether schizophrenics show activation in a different portion of BA 6 in this dataset than do controls (e.g., do the schizophrenic data tend to be in medial frontal gyrus while the controls are more likely to be in precentral gyrus?).
Using the TD, one could parse the output strings for these annotated clusters for "Brodmann area 6" and then for the fields preceding that string, to determine which gyrus had also been applied as a label. This relies on the "part of" coding implicit in the TD output. In testing the ontology, we use the explicit part of relationship in the ontology to identify all the segments of BA 6, and then query the dataset for those terms. The computation of whether the distribution of locations is different in the two subject populations requires a statistical test, which is not possible within the ontology itself. However, the data needed for the statistical test can be identified by finding clusters in the dataset labeled with terms which are part_of Brodmann area 6.
(and machine-accessible) in the part hierarchy of the FMA (Figure 2B). The latter is the kind of information that the ontology can provide to facilitate automated reasoning in computer applications. A total of 2500 terms were added to FMA as a result of this process. No new relationships were required. The view of FMA including these terms (NeuroFMA) is available through NCBO's Bioportal and the enhanced RadLex-FMA that incorporates these terms is also available as part of the RadLex ontology.

Ontology application testing
The annotation was done manually; once the matching term for a TD label such as RightCerebrum.FrontalLobe.MiddleFrontalGyrus.GrayMatter.BrodmannArea6 and its relationships were available in FMA, the FMA ID for that term was added to the clusters in that region in the larger dataset. The 1740 datapoints were labeled with 202 unique terms at the most granular level. Once the datapoints were annotated, the labels and the partonomies in the FMA were used to answer the original queries as described below.
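The annotation step lends itself to a small lookup sketch: split the dotted TD label into its five fields and map the result to an FMA ID. The names `TDLabel`, `parse_td_label`, `TD_TO_FMA` and `annotate` are hypothetical, and the table holds only the one label-to-ID pairing stated in the text (FMA ID 271653); a real table would cover all terms in use.

```python
from typing import NamedTuple, Optional


class TDLabel(NamedTuple):
    """The five dot-separated fields of a Talairach Daemon label."""
    hemisphere: str
    lobe: str
    gyrus: str
    tissue: str
    cell_type: str


def parse_td_label(label: str) -> TDLabel:
    """Split a TD label into its fields, tolerating stray whitespace."""
    parts = [p.strip() for p in label.split(".")]
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {label!r}")
    return TDLabel(*parts)


# Hypothetical lookup table with the single mapping given in the text.
TD_TO_FMA = {
    TDLabel("LeftCerebrum", "FrontalLobe", "MiddleFrontalGyrus",
            "GrayMatter", "BrodmannArea9"): 271653,
}


def annotate(label: str) -> Optional[int]:
    """Return the FMA ID for a TD label, or None if it is not in the table."""
    return TD_TO_FMA.get(parse_td_label(label))


if __name__ == "__main__":
    print(annotate("LeftCerebrum.FrontalLobe.MiddleFrontalGyrus.GrayMatter.BrodmannArea9"))
```

Keying the table on the parsed tuple rather than the raw string keeps the lookup robust to spacing differences in the TD output.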

Analysis 1. Which parts of BA 6 are activated in SZ and controls?
To complete this analysis, we compared the distribution of activation labels in the dataset. Specifically, we need to query the ontology to determine which subregions make up Brodmann area 6. DXBrain (Detwiler et al., 2009) is a lightweight distributed query-based data integration system that allows any data source to be included as long as it is available as a document on the web or as a web service, and as long as it returns XML. DXBrain allows the user to compose a distributed XQuery to any number of such sources (XQuery being the W3C recommended query language for XML); the processor sends subqueries to each source and then packages the results as a single XML document. The resulting document can then be viewed in multiple ways, including a 3-D brain surface visualization if three dimensional coordinates are available.
In the current situation there were two sources that DXBrain queried: (1) an XML document generated from the spreadsheet (Figure 3B) containing the fMRI summary data, including the 1740 FMAIDs and an indication of whether the cluster was from a normal or SZ subject; and (2) the ontology web service, which can access the NeuroFMA, the full FMA or FMA-RadLex.
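Combining the two sources amounts to a join followed by a tally, which can be sketched as below. This is not DXBrain or XQuery; the function `tally_by_group`, the record layout, and the FMAIDs in the example are all assumptions made for illustration.

```python
from collections import Counter


def tally_by_group(clusters, region_fmaids):
    """Count clusters per (FMAID, diagnostic group), keeping only clusters
    whose FMAID is among those returned by the ontology query."""
    counts = Counter()
    for cluster in clusters:
        if cluster["fmaid"] in region_fmaids:
            counts[(cluster["fmaid"], cluster["group"])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records standing in for the spreadsheet-derived XML.
    clusters = [
        {"fmaid": 1001, "group": "HV"},
        {"fmaid": 1001, "group": "SZ"},
        {"fmaid": 1002, "group": "SZ"},
        {"fmaid": 9999, "group": "HV"},  # outside the region of interest
    ]
    # Hypothetical IDs standing in for the ontology web service result.
    region = {1001, 1002}
    for key, n in sorted(tally_by_group(clusters, region).items()):
        print(key, n)
```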
representations within FMA of which areas the SLF 1 projects from and projects to, and the regional relationships between those areas and the areas labeled in the data.
The queries were performed using the DXBrain software (Detwiler et al., 2009; see description below) that directly answered the questions by accessing the FMA ontology and displayed the results in a tabulated form. These queries can be generalized to many others, regarding different divisions or subdivisions of the cortex depending on the interest of the researcher.

Results
Enhancing neuroanatomical content in FMA
Figure 2 shows an example TD label and its ontological representation in the FMA. To continue with that example, the gray matter of the right precentral gyrus includes parts of Brodmann areas 3, 4, 6, 9, 13, 43 and 44. Brodmann area 6, on the other hand, includes parts of the precentral gyrus, the superior, the middle, the inferior and the medial frontal gyri. New classes were therefore created in the FMA to accommodate the overlapping regions between the different Brodmann areas and the different gyri. Hence, the Gray matter of precentral gyrus has_regional_part Brodmann area 3 of precentral gyrus, Brodmann area 4 of precentral gyrus, Brodmann area 6 of precentral gyrus, Brodmann area 9 of precentral gyrus, Brodmann area 13 of precentral gyrus, Brodmann area 43 of precentral gyrus and Brodmann area 44 of precentral gyrus; and conversely, Brodmann area 6 has_regional_part Brodmann area 6 of precentral gyrus, Brodmann area 6 of inferior frontal gyrus, Brodmann area 6 of middle frontal gyrus, Brodmann area 6 of superior frontal gyrus and Brodmann area 6 of medial frontal gyrus (see Figure 4).
We then mapped the Talairach term Right Cerebrum.Frontal Lobe.PreCentral Gyrus.Gray Matter.Brodmann area 6 to the new FMA class called Brodmann area 6 of right precentral gyrus, which is_a Segment of Brodmann area 6 and a regional_part_of both Brodmann area 6 and Gray matter of right precentral gyrus (Figure 2B). The part_of relationship is a superproperty of regional_part_of (i.e., if A is a regional_part_of B, then A is also part_of B).
Following the transitive part_of relation of Brodmann area 6 of right precentral gyrus up the FMA partonomy reveals that all the granularity levels implicitly stated in the Talairach label are explicitly represented
terms in the two diagnostic groups. In the previous example, portions of a cytoarchitecturally defined Brodmann region that extended into other sulci and gyri had to be identified; in this case, the particular subregions of the precentral gyrus were identified from within the ontology, then queried for in the annotated dataset. The analysis then proceeds by determining how many times each of the FMAIDs for these subregions of the precentral gyrus is found per diagnostic category in the activation cluster instances from the dataset.
Results. There were a total of 157 instances found with labels identifying them as being within the precentral gyrus, in left and right BA 4, BA 6, left BA 43, and left and right BA 44. Of these, 77 were from the healthy controls, 80 were from the SZ participants, and the distribution across regions was not significantly different in the two groups (χ²(6, N = 157) = 9.87, p < 0.130).

Analysis 3: Do the SZ subjects show more or less co-activation than healthy subjects do, in cortical regions that are connected by the Superior Longitudinal Fasciculus I (SLF1)?
Approach. The relationships projects_to and projects_from in FMA are used to represent neuroanatomical consensus regarding both the connections between regions and their direction along a white matter tract. We queried the ontology for which cortical regions were in a projects to or projects from relationship with other types along the SLF1. Using that list of terms, we could then query as in the above examples.
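One way to turn the tract endpoints into a per-group count is sketched below. The definition of co-activation used here (at least one labeled cluster at a source region and at least one at a target region, within the same subject) is an assumption for illustration, as are the function names, the record layout, and the target region; only the two source region names are taken from the text.

```python
# Hypothetical endpoint sets; in practice these would come from
# projects_from / projects_to queries against the ontology.
PROJECTS_FROM = {
    "Brodmann area 5 of superior parietal lobule",
    "Brodmann area 5 of postcentral gyrus",
}
PROJECTS_TO = {"Brodmann area 6 of superior frontal gyrus"}  # assumed target


def coactivates(labels, sources=PROJECTS_FROM, targets=PROJECTS_TO):
    """True if the subject has a cluster at both ends of the tract."""
    labels = set(labels)
    return bool(labels & sources) and bool(labels & targets)


def coactivation_counts(subjects):
    """Return {group: (subjects with co-activation, subjects in group)}."""
    counts = {}
    for subject in subjects:
        hits, total = counts.get(subject["group"], (0, 0))
        hits += int(coactivates(subject["labels"]))
        counts[subject["group"]] = (hits, total + 1)
    return counts


if __name__ == "__main__":
    subjects = [  # hypothetical subjects, for illustration only
        {"group": "HV", "labels": ["Brodmann area 5 of postcentral gyrus",
                                   "Brodmann area 6 of superior frontal gyrus"]},
        {"group": "SZ", "labels": ["Brodmann area 5 of postcentral gyrus"]},
    ]
    print(coactivation_counts(subjects))
```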
Results. In the FMA, the SLF1 is with the dorsal segment of the SLF (which also includes a ventral segment, a main segment, and the arcuate fasciculus which were not considered here). The SLF1 projects from the following areas: Brodmann area 5 of superior parietal lobule, Brodmann area 5 of posterior segment of paracentral lobule, Brodmann area 5 of postcentral gyrus,
Approach. The text below shows a portion of the query pulling in the RDF schema and the OWL representations of FMA to identify terms for regional parts of Brodmann Area 6. The full version of this query, which can be run by the interested user, is at the SIG website. The distributed XQuery first issues a subquery written in vSparQL to the ontology web service containing the FMA; it recursively gathers the list of the parts of Brodmann area 6. From the activation dataset, the query computes some simple statistics, for both healthy and schizophrenic subjects, of the distribution of activation sites across the frontal lobe parts returned by the ontology subquery. The counts for the individual structures are summarized in the returned results.
The following terms were identified as a regional_part_of Brodmann area 6: This search of the ontology thus returns the FMAIDs associated with each subregion of BA 6. The analysis proceeds by determining how many times each of these FMAIDs are found per diagnostic category in the activation cluster instances from the dataset.
Results. In the 1740 instances included in the dataset, there were 212 instances spread across the different regional parts of BA 6; 105 from the controls and 107 from the SZ subjects, as shown in Table 1. The distribution of cases and controls over the parts of Brodmann area 6 was not significantly different (χ²(7, N = 212) = 5.73, p < 0.571). We verified this result in these data by looking at the overlapping loci in the original dataset.
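The statistical step that follows the ontology query is a Pearson chi-square test on a groups-by-regions table of counts. A minimal standard-library sketch is given below; the function name `chi_square` and the counts in the example are hypothetical and are not the values from Table 1. It returns only the statistic and degrees of freedom, and it assumes every row and column total is nonzero.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    observed counts (rows = groups, columns = regions).

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


if __name__ == "__main__":
    # Hypothetical counts: 2 groups x 8 regional parts, giving 7 degrees
    # of freedom as in a two-group comparison over eight regions.
    controls = [12, 9, 15, 7, 11, 8, 10, 6]
    patients = [10, 11, 13, 9, 12, 6, 9, 8]
    stat, dof = chi_square([controls, patients])
    print(f"chi2({dof}) = {stat:.2f}")
```

A p-value would then be read from the chi-square distribution with `dof` degrees of freedom, for example using a statistics package.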

Analysis 2: Which parts of the precentral gyrus are active in SZ and HV in these data?
Approach. Following the same method as above, we first identified instances labeled with terms which were part_of the precentral gyrus, then determined whether there was a difference in the distribution of

about a given domain. It is the representation of knowledge that already exists, rather than the discovery of new relationships. The extension of the neuroanatomical module of FMA with this fuller formalization of the anatomical labeling schemes begins to address these needs, by making explicit some of the relationships between the neuroanatomical entities referenced by the different nomenclatures, and providing it in a human- and computer-readable format. The ontology can then be used as a standard terminology for annotating neuroimaging data. With the ontology in place, the questions posed in the examples could be answered with a single query, rather than generating a query for every subpart of the various regions under different naming schemes. Given that we had access to the original imaging data in these examples, we could have done our analysis using standard neuroimaging analyses without the ontological annotation; however, our methods are directed toward summarized, standardized results, as might be found in clinical reports of fMRI scans or from research databases (such as Bockholt et al., 2009, or Keator et al., 2008). The application of this work is not in mega-analyses of raw fMRI data per se (in the sense of Costafreda, 2009), but in the mining of processed and annotated imaging data sets, or in mining published findings such as the BrainMap repository (Fox and Lancaster, 2002) or the PubBrain project (www.pubbrain.org). The BrainMap project enforces comparability across its repository by only collecting published results that are reported using 3D coordinates in the MNI space or the Talairach-Tournoux space, while the PubBrain project aims to represent more generic findings from the literature.
Both projects represent information about neuroimaging experiments and the subsequent brain areas of significant activation. The neuroanatomical ontologies developed here form a foundation for querying and retrieval across these kinds of repositories, as demonstrated through the DXBrain querying reported above and elsewhere (Detwiler et al., 2009).
A concern in the application of these annotation techniques is what, exactly, is being annotated. It is common for only the voxel with the largest effect size in the statistical analyses to be reported (the maximal voxel, or locus of maximal activation), and that is the approach we took in this case; the maximal voxel may or may not be the most representative piece of information. The statistical analyses explicitly look for clusters of contiguous voxels which pass a statistical threshold to be considered "significant", since a single voxel alone is not reliable. Alternative summaries of the fMRI results include the voxel at the center of gravity, that is, the voxel in the "middle" of a cluster, which is often not the same as the maximal voxel, and in fact need not be statistically significant in its own right. There is also the option of reporting the proportion of the cluster in each of various distinct regions, or the size of the cluster as well as the maximal voxel. Any data mining effort combining results across different data sources must of course take care that what is combined is the same across the various sources. This caveat, however, is independent of the representation of neuroanatomical knowledge in the ontology; the standardized labeling schema and the partonomy relationships are needed whether what is being annotated is the maximal voxel, the median voxel, the largest portion of the cluster, or any representation of the location of the results of the fMRI data analysis.
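The difference between these cluster summaries can be made concrete with a small sketch. The cluster below is hypothetical, and the center of gravity is taken as the unweighted mean coordinate (an assumption; some packages weight by the test statistic).

```python
from collections import Counter

# Hypothetical cluster: ((x, y, z) coordinate, test statistic, region label).
cluster = [
    ((0, 0, 0), 2.5, "precentral gyrus"),
    ((1, 0, 0), 3.0, "precentral gyrus"),
    ((2, 0, 0), 6.0, "middle frontal gyrus"),
]


def maximal_voxel(voxels):
    """Coordinate of the voxel with the largest test statistic."""
    return max(voxels, key=lambda v: v[1])[0]


def center_of_gravity(voxels):
    """Unweighted mean coordinate of the cluster (assumed definition)."""
    n = len(voxels)
    return tuple(sum(v[0][axis] for v in voxels) / n for axis in range(3))


def region_proportions(voxels):
    """Fraction of the cluster's voxels that falls in each labeled region."""
    counts = Counter(v[2] for v in voxels)
    return {region: k / len(voxels) for region, k in counts.items()}


peak = maximal_voxel(cluster)
middle = center_of_gravity(cluster)
proportions = region_proportions(cluster)
```

In this toy cluster the maximal voxel carries the label middle frontal gyrus even though two thirds of the cluster lies in the precentral gyrus, which is the kind of discrepancy the paragraph above warns about.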
The choice of automated labeling techniques here was one of convenience, and does have the caveat that the labeling is not as precise as it might be if done by hand by a trained neuroanatomist

Brodmann area 7 of superior parietal lobule, Brodmann area 7 of posterior segment of paracentral lobule, and Brodmann area 7 of precuneus. It projects to Brodmann area 6 of superior frontal gyrus, Brodmann area 6 of medial frontal gyrus, Brodmann area 6 of dorsal part of precentral gyrus, Brodmann area 8 of superior frontal gyrus, Brodmann area 8 of medial frontal gyrus, Brodmann area 9 of superior frontal gyrus, Brodmann area 9 of medial frontal gyrus. Parallel representations exist for both left and right SLF1. By retrieving from the ontology the areas which each SLF1 projects to and projects from, the previous DXBrain queries can be modified to identify which subjects have an activation cluster reported in an area that the SLF1 projects from, and another in an area that the SLF1 projects to, in the same hemisphere.
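The logic of the modified query can be sketched as follows. This is not the DXBrain query itself: the area names are those listed in the text, while the subject identifiers, the tuple layout, and the function name are hypothetical.

```python
# Areas the SLF1 projects from and projects to, as listed in the text.
PROJECTS_FROM = {
    "Brodmann area 5 of superior parietal lobule",
    "Brodmann area 5 of posterior segment of paracentral lobule",
    "Brodmann area 5 of postcentral gyrus",
    "Brodmann area 7 of superior parietal lobule",
    "Brodmann area 7 of posterior segment of paracentral lobule",
    "Brodmann area 7 of precuneus",
}
PROJECTS_TO = {
    "Brodmann area 6 of superior frontal gyrus",
    "Brodmann area 6 of medial frontal gyrus",
    "Brodmann area 6 of dorsal part of precentral gyrus",
    "Brodmann area 8 of superior frontal gyrus",
    "Brodmann area 8 of medial frontal gyrus",
    "Brodmann area 9 of superior frontal gyrus",
    "Brodmann area 9 of medial frontal gyrus",
}


def subjects_spanning_tract(activations, sources, targets):
    """Subjects with a source-area and a target-area activation
    reported in the same hemisphere."""
    seen_from, seen_to = set(), set()
    for subject, hemisphere, region in activations:
        if region in sources:
            seen_from.add((subject, hemisphere))
        if region in targets:
            seen_to.add((subject, hemisphere))
    return {subject for subject, _ in seen_from & seen_to}


# Hypothetical activation records: (subject, hemisphere, region label).
activations = [
    ("s01", "L", "Brodmann area 7 of precuneus"),
    ("s01", "L", "Brodmann area 6 of medial frontal gyrus"),
    ("s02", "L", "Brodmann area 7 of superior parietal lobule"),
    ("s02", "R", "Brodmann area 6 of superior frontal gyrus"),
    ("s03", "R", "Brodmann area 8 of superior frontal gyrus"),
]

spanning = subjects_spanning_tract(activations, PROJECTS_FROM, PROJECTS_TO)
```

Subject s02 is excluded because its two activations fall in opposite hemispheres, and s03 because it has no activation in a source area.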
In this dataset from this analysis, very few subjects showed maximal voxels in regions connected by the SLF1. The majority of the activation in the postcentral gyrus was in Brodmann areas 1-3, rather than 5, and thus was not implicated in this query. Similarly, prefrontal and frontal activations were in areas not connected by the SLF1. Fifty-one subjects showed activation in at least one area that the SLF1 projects from, and 42 subjects showed activation in at least one area that the SLF1 projects to, in either hemisphere. Only five subjects showed maximal voxels in both an area that the SLF1 projects to and an area that the SLF1 projects from, and in each case it was from a subregion of either left or right BA 7 to ipsilateral BA 6 of the medial or precentral gyrus. Of those five, four were subjects with schizophrenia, which, assuming equal probabilities for either group, has a probability of 0.15. As we note in the Discussion section, the maximal voxel is a limited representation of the activation cluster, and other measures may have shown different results.
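The probability quoted for four of five subjects belonging to one group can be reproduced with a short binomial calculation. The reading that 0.15 refers to exactly four of five, rather than four or more, is our interpretation; both values are computed below, and the function names are ours.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


exactly_four = binomial_pmf(4, 5)          # 5/32 = 0.15625
at_least_four = binomial_upper_tail(4, 5)  # 6/32 = 0.1875
```

Under either reading the result is not unusual for a sample of five subjects.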

Discussion
In this work we have expanded the FMA to include anatomical entities represented by terms from a number of commonly used human neuroimaging atlases. We extracted the enhanced content as a view (NeuroFMA), incorporated the view into RadLex and then used the enhanced ontology to annotate and query a large test dataset of neuroimaging results. The process involves developing correlative representations of different but overlapping brain parcellations, so that segments of gyri can be identified with portions of Brodmann areas and vice versa. This integration has been represented previously in various software and atlases, as in the TD atlas, the implementation of the PALS atlas in SumsDB/Caret (Van Essen, 2005), and the multiple labels of the Brede database (Nielsen et al., 2004); of these, SumsDB at least allows queries to be performed across parcellation schemes in a single database. NeuroNames (Bowden et al., 2007) is also a standardized ontology which indicates a hierarchy of neuroanatomical structure; it includes very complete definitions of the Brodmann areas, for example, but not their subdivisions across sulci and gyri. Its release as an ontology has been incorporated into FMA, in that the same terms have been placed within the FMA hierarchy. The extension to the neuroanatomical module of FMA to include the TD terms is the first attempt to represent these two disparate human neuroanatomical parcellation schemes within a single formal ontology.
An ontology fulfills the need for externalization, formalization, and standardization of a body of knowledge (as discussed in Larson and Martone, 2009). An ontology serves to formalize knowledge

June 2010 | Volume 4 | Article 10 | 10 Turner et al. Neuroanatomical ontologies in neuroimaging

The neuroanatomical modules within the FMA and RadLex ontologies largely include part_of and is_a relationships. Other spatio-structural relationships such as boundary and spatial relations (connectivity, adjacency, and location) are also needed to represent deeper neuroanatomical knowledge. While white matter and gray matter structures are comprehensively represented in the taxonomy, only some of the initial connections are included as used in the validation steps above. Representation of developmental history of the different regions would also be relevant. All of these needs have been previously identified and some are being actively developed. The power of ontological representations of neuroanatomical knowledge is only fully realized when it is combined with ontologies from other domains in a knowledge engineering framework.
In this study we have demonstrated how a reference ontology such as the FMA can provide a particular subset of the ontology to support user applications. We applied the neuroanatomical knowledge of the FMA to expedite the programmatic analyses of data. Because the knowledge derived is computable and machine-processable, its correlation and integration with other orthogonal ontologies can be more readily achieved. Although the analyses performed in this study required knowledge of which data came from which diagnostic group, for example, which is not part of the FMA or RadLex per se, additional structured knowledge related to the data can be derived from other ontologies such as NIFSTD (Bug et al., 2008b). Information such as the fMRI scanning methods can be formally annotated using terms and properties from RadLex, while the details of subject recruitment and the simultaneous collection of behavior and imaging data can come from extensions to the Ontology of Biomedical Investigations (OBI; Brinkman et al., in press). The same is true for specifying the full representation of the cognitive paradigm used during the scanning, which will come from the Cognitive Paradigm Ontology currently under development (www.cogpo.org). A formal representation of the relationship between the experimental design, the data, the analysis, and the conclusions drawn from the analyses is the goal of the project for Knowledge Engineering from Experimental Design (KEfED; Burns and Russ, 2009). The full capacity to annotate and re-use neuroimaging data to automatically derive conclusions regarding the function of various brain areas in cognitive function or dysfunction will require the fuller development of all of these representations and their broader use within the neuroimaging clinical and research community.
Acknowledgments
Support for this work came from contract number HHSN268200800020C from the National Institute of Biomedical Imaging and Bioengineering (NIBIB) through the Radiological Society of North America; the Biomedical Informatics Research Network, 1 U24 RR025736-01, and the Functional Imaging in Biomedical Informatics Research Network (FBIRN), 1 U24 RR021992; grant HL087706 from the National Heart, Lung and Blood Institute (NHLBI); and we also acknowledge support through a grant from the National Cancer Institute (NCI) through the cancer Biomedical Informatics Grid (caBIG) Imaging Workspace, subcontract from Booz-Allen & Hamilton, Inc.
viewing the original native images. Devlin and Poldrack (2007) point out that localization based on the Brodmann areas labeled in the Talairach and Tournoux atlas is imprecise, and argue for "tedious anatomy" with manual reference to neuroanatomical atlases. The quantitative comparisons of the TD output with other standardized atlases performed by Bohland et al. (2009) support this, in that the Talairach atlas labels showed among the least concordance with other atlases when coregistered to a standard brain. The concordance, however, was improved dramatically when the label was assigned using the "nearest gray matter" option, which is what we did here. This raises the question of the precision of the labeling, but it does not call into question the ontological basis for the label itself. Is there a part of Brodmann area 6 that is in the precentral gyrus, and another part in the middle frontal gyrus? The ability to apply that distinction meaningfully may not be provided by the TD yet, but other efforts such as those of Scheperjans et al. (2008a,b) are continuing to identify means for standardizing where the boundaries are between morphologically defined regions and cytoarchitectural regions. Whether the ability to use those terms to identify consistent cortical regions is sufficiently reliable now, or will be in the future, the ability to place the concepts for those regions within the hierarchy, represent their relationships, and reason about their parts or connections is facilitated by this extension to the FMA.
A similar and more general caveat concerns the limitations of brain region definitions themselves; for example, absolute boundaries for regions such as the gyri are not agreed upon across nomenclatures and atlases. Even though we can identify the term middle frontal gyrus from NeuroNames (Bowden and Martin, 1995) or the human PALS-B12 atlas (Van Essen, 2005) as being in some way similar to the term RightCerebrum.FrontalLobe.MiddleFrontalGyrus from the TD, the exact boundaries of the middle frontal gyrus will vary in usage from one atlas to another (see also Joshi et al., 2009). This is a challenge for FMA as it attempts to harmonize across different atlases. The Structural Lexicon Task Force of the International Neuroinformatics Coordinating Facility (INCF; www.incf.org) has been facing this same issue: when one nomenclature identifies a particular region as beginning and ending in the minima of the bounding sulci, for example, while another nomenclature identifies the same label but puts the boundaries of the region at the maximal point of curvature between sulcus and gyrus, the two regions are defined differently. All the different spatial localization efforts and improved coregistration methods, or even manual annotation, will never harmonize those nomenclatures. The recommendation from the Task Force thus far has been to include each definition as a different concept or type, and to differentiate the labels by "Area X as defined by system Y"; each concept defined should be "anchored" in some way to a classical structure, e.g., that it is a type of middle frontal gyrus (M. Martone, member of INCF, personal communication). The FMA can serve as a source or reference ontology for the classical structures.
Efforts like the Brain Architecture Management System (BAMS, Swanson, 2008, 2010) are including spatial reasoning methods to work with these overlapping but not identical definitions of brain regions, and formalized logical reasoning has been developed for it working with the CoCoMac connectivity database (see Stephan et al., 2000).
The work that we have done here is only the beginning of the extensive development necessary to support application programs used in neuroscience research and clinical practice.