Event Abstract

Linked Neuron Data (LND): A Platform for Integrating and Semantically Linking Neuroscience Data and Knowledge

  • 1 Institute of Automation, Chinese Academy of Sciences, China

Linked Neuron Data (LND) is a Web-based platform for integrating and semantically linking neuroscience data and knowledge from multiple scales and multiple data sources, in order to support a comprehensive understanding of the brain. Currently, LND integrates structured neuroscience knowledge from the Allen Brain Atlas [Sunkin et al. 2013], NeuroLex [Larson and Martone 2013], the NIF Ontology [Imam et al. 2012], NeuroMorpho [Ascoli et al. 2007], MeSH terms, etc. It also extracts declarative domain knowledge from unstructured sources such as PubMed abstracts, neuroscience literature, and books. The extracted knowledge is represented as triples, such as <apical dendrite, part of, pyramidal neuron>; 247,239 triples have been extracted using pattern-based information extraction. To safeguard the quality of the extracted triples, current extraction focuses on is-a relations, part-whole relations, synonym relations, and attribute-value pairs for specific entities, among others. All integrated and extracted knowledge is represented in RDF/OWL. Currently, LND contains 2,567,178 semantic knowledge triples that describe various kinds of declarative neuroscience knowledge, including: (1) hierarchical organizations of the brain (hierarchical brain regions, types of neurons in a specific region, etc., as shown in Figure 1); (2) links among different brain components from different species; (3) relationships among brain components, brain diseases, and cognitive functions (25,497 triples, extracted from PubMed articles and Wikipedia pages); (4) facts about brain components (e.g. location distributions, functions, and neurotransmitters of certain types of neurons).
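The pattern-based extraction described above can be pictured with a minimal Python sketch. The three regular expressions, the relation labels, and the function name `extract_triple` are illustrative assumptions; the abstract does not list the patterns LND actually uses.

```python
import re

# Illustrative patterns only (assumption): the real LND patterns are not
# given in the abstract. Each pattern maps a sentence shape to a relation.
PATTERNS = [
    (re.compile(r"^(?:the |an? )?(.+?) is (?:a )?part of (?:the |an? )?(.+?)\.?$", re.I),
     "part of"),
    (re.compile(r"^(?:the |an? )?(.+?) is an? (?:type|kind) of (.+?)\.?$", re.I),
     "is a"),
    (re.compile(r"^(?:the |an? )?(.+?) is also (?:called|known as) (?:the |an? )?(.+?)\.?$", re.I),
     "synonym"),
]


def extract_triple(sentence):
    """Return (subject, relation, object) for the first matching pattern, else None."""
    text = sentence.strip()
    for pattern, relation in PATTERNS:
        match = pattern.match(text)
        if match:
            subject, obj = (group.strip().lower() for group in match.groups())
            return (subject, relation, obj)
    return None


example = extract_triple("The apical dendrite is part of the pyramidal neuron.")
```

Restricting the sketch to a few rigid sentence shapes mirrors the stated design choice of trading coverage for triple quality; sentences that match no pattern are simply skipped.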

Compared with many other efforts (such as NeuroLex [Larson and Martone 2013], whose knowledge comes mainly from experts' manual contributions), most of the knowledge in Linked Neuron Data (LND) is integrated or automatically extracted from structured and unstructured data sources. LND focuses on how to semantically link knowledge from different data sources, and on how neuroscience researchers can benefit from these links and from the resulting neuroscience knowledge base in general.

Links among different resources are constructed automatically by the stepwise bag-of-words entity linking algorithm developed in our previous work on creating very large-scale semantic knowledge bases [Zeng et al. 2013a, Zeng et al. 2013b]. Several measures ensure the quality of the entity linking process for LND. First, resources that share the same term are mapped to each other directly and automatically. Second, synonyms of neuroscience domain terms are used to link resources together. These synonym pairs are extracted from Wikipedia redirects (7,222,839 pairs in all, of which 23,270 contain neuroscience domain terms in the current LND knowledge base), the Allen Brain Atlas (4,756 pairs), NeuroLex (103,704 pairs), etc. Third, resources that share the same term but not the same meaning are differentiated through an entity disambiguation process, using the tool developed in [Zeng et al. 2013a]. For example, Wikipedia currently has 5 pages for the term “hippocampus”, which refer to a type of animal, a brain region, a type of magical beast, etc.; when linking knowledge about “hippocampus” from Wikipedia to the Allen Brain Atlas, the program automatically finds the right “hippocampus” to link to. LND has so far established 31,940 links among different knowledge sources (e.g. links among domain terms from the Allen Brain Atlas, NeuroLex, and DBpedia/Wikipedia). A total of 1,322 entity disambiguation processes were carried out, with a precision of 91.2%. All entities that are semantically equivalent are linked by the owl:sameAs relation, so that this knowledge can be used for semantic search and reasoning tasks over the Linked Neuron Data platform.
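The linking steps above (direct term mapping, synonym-based mapping, and disambiguation of ambiguous terms) can be sketched in Python as follows. This is a simplified word-overlap illustration under stated assumptions, not the stepwise algorithm of [Zeng et al. 2013a]; all identifiers such as `aba:HIP` and `wiki:Hippocampus_(brain)`, the descriptions, and the function names are hypothetical.

```python
import re


def bag_of_words(text):
    """Set of lowercase words in a description."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(context, candidates):
    """Pick the candidate whose description shares the most words with the context."""
    context_words = bag_of_words(context)
    return max(candidates,
               key=lambda uri: len(context_words & bag_of_words(candidates[uri])))


def link_entities(terms_a, terms_b, synonym_pairs):
    """Return owl:sameAs links between two term->identifier dictionaries."""
    links = set()
    # Step 1: direct mapping when both sources share the same term.
    for term, uri_a in terms_a.items():
        if term in terms_b:
            links.add((uri_a, "owl:sameAs", terms_b[term]))
    # Step 2: mapping through synonym pairs, checked in both directions.
    for left, right in synonym_pairs:
        for x, y in ((left, right), (right, left)):
            if x in terms_a and y in terms_b:
                links.add((terms_a[x], "owl:sameAs", terms_b[y]))
    return links


# Toy data (hypothetical identifiers and descriptions).
candidates = {
    "wiki:Hippocampus_(genus)": "genus of small marine fish known as seahorses",
    "wiki:Hippocampus_(brain)": "brain region in the temporal lobe involved in memory",
    "wiki:Hippocampus_(mythology)": "mythological sea creature with the body of a horse",
}
context = "hippocampal formation, a brain region of the temporal lobe"
chosen = disambiguate(context, candidates)

links = link_entities(
    {"hippocampus": "aba:HIP", "ammon's horn": "aba:CA"},
    {"hippocampus": "nlx:hippocampus", "cornu ammonis": "nlx:CA"},
    [("cornu ammonis", "ammon's horn")],
)
```

Direct mapping runs first because it is the cheapest and least error-prone step; disambiguation is only needed when one term has several candidate entities.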

With these entity linking efforts, neuroscience knowledge from different sources is connected, and neuroscience researchers can benefit from the links. For example, LND users can obtain 265 pieces of knowledge about “hippocampus” from 6 sources (the Allen Brain Atlas, the NIF Ontology, NeuroLex, the CAS Brain Ontology, NeuroMorpho, and DBpedia). LND can also identify how many brain regions contain pyramidal neurons through the integration and linking of knowledge from NeuroLex, Wikipedia, and NeuroMorpho.
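One way to picture how owl:sameAs links let a single query gather knowledge from several sources is to group linked identifiers into equivalence clusters and collect every triple whose subject falls in the same cluster. The sketch below uses a small union-find structure; the identifiers, the toy triples, and the function names are hypothetical, and the abstract does not describe how LND implements this internally.

```python
def same_as_finder(links):
    """Build a find() function that maps each identifier to its cluster root."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # owl:sameAs is symmetric and transitive, so linked identifiers are merged.
    for a, b in links:
        parent[find(a)] = find(b)
    return find


def facts_about(entity, triples, links):
    """Collect triples whose subject is the entity or anything linked to it."""
    find = same_as_finder(links)
    root = find(entity)
    return [triple for triple in triples if find(triple[0]) == root]


# Toy data (hypothetical identifiers).
links = [("aba:HIP", "nlx:hippocampus"), ("nlx:hippocampus", "dbp:Hippocampus")]
triples = [
    ("aba:HIP", "part of", "aba:HPF"),
    ("dbp:Hippocampus", "label", "Hippocampus"),
    ("nlx:cerebellum", "is a", "brain region"),
]
hippocampus_facts = facts_about("nlx:hippocampus", triples, links)
```

Here a query on the NeuroLex identifier also returns the triples attached to the linked Allen Brain Atlas and DBpedia identifiers, while the unrelated cerebellum triple is left out.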

The Linked Neuron Data (LND) platform can be accessed at http://www.linked-neuron-data.org. Figure 2 provides a screenshot of the LND platform. The neuroscience knowledge in LND can be obtained by issuing SPARQL or keyword queries to the platform, and the query results can be downloaded directly or displayed on the LND Web interface.
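As a rough illustration of what a SPARQL query against such a platform might look like, the Python sketch below only builds a query string and a request URL; it does not contact any server. The endpoint path `/sparql`, the use of `rdfs:label` to name entities, and the function name `build_query` are assumptions, since the abstract does not document the endpoint or the vocabulary used by LND.

```python
from urllib.parse import urlencode

# Hypothetical endpoint path (assumption): only the site URL appears in the abstract.
ENDPOINT = "http://www.linked-neuron-data.org/sparql"


def build_query(label, limit=20):
    """Build a SPARQL query listing properties of the entity with the given label."""
    # Escape backslashes and double quotes so the label is a valid string literal.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?property ?value WHERE {\n"
        f'  ?entity rdfs:label "{safe_label}" .\n'  # assumes labels are stored via rdfs:label
        "  ?entity ?property ?value .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


query = build_query("hippocampus")
# The SPARQL protocol passes the query text in the "query" parameter of a GET request.
url = ENDPOINT + "?" + urlencode({"query": query})
```

The resulting URL could then be fetched with any HTTP client, provided the platform exposes a standard SPARQL protocol endpoint at that address.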

Figure 1
Figure 2

References

[Sunkin et al. 2013] Sunkin, S.M., Ng, L., Lau, C., Dolbeare, T., Gilbert, T.L., Thompson, C.L., Hawrylycz, M., Dang, C. (2013). Allen Brain Atlas: an integrated spatio-temporal portal for exploring the central nervous system. Nucleic acids research 41(D1): D996-D1008.
[Larson and Martone 2013] Larson, S.D., Martone, M. (2013). NeuroLex.org: an online framework for neuroscience knowledge. Frontiers in Neuroinformatics 7:18.
[Imam et al. 2012] Imam, F.T., Larson, S.D., Bandrowski, A., Grethe, J.S., Gupta, A., Martone, M.E. (2012). Development and use of ontologies inside the Neuroscience Information Framework: a practical approach. Frontiers in Genetics 3:111.
[Ascoli et al. 2007] Ascoli, G.A., Donohue, D.E., Halavi, M. (2007). NeuroMorpho.Org: a central resource for neuronal morphologies. The Journal of Neuroscience 27(35): 9247-9251.
[Zeng et al. 2013a] Zeng, Y., Wang, D.S., Zhang, T.L., Wang, H., Hao, H.W. (2013). Linking entities in short texts based on a Chinese semantic knowledge base. Communications in Computer and Information Science 400: 266-276.
[Zeng et al. 2013b] Zeng, Y., Wang, D.S., Zhang, T.L., Wang, H., Hao, H.W., Xu, B. (2013). CASIA-KB: A multi-source Chinese semantic knowledge base built from structured and unstructured Web data. Proceedings of the third Joint International Semantic Technology Conference, Seoul, Korea, Springer.

Keywords: neuroinformatics, Knowledge Bases, multi-scale, Entity linking, Entity disambiguation, Semantic Web, Knowledge representation

Conference: Neuroinformatics 2014, Leiden, Netherlands, 25 Aug - 27 Aug, 2014.

Presentation Type: Demo, to be considered for oral presentation

Topic: Infrastructural and portal services

Citation: Zeng Y, Wang D, Zhang T and Xu B (2014). Linked Neuron Data (LND): A Platform for Integrating and Semantically Linking Neuroscience Data and Knowledge. Front. Neuroinform. Conference Abstract: Neuroinformatics 2014. doi: 10.3389/conf.fninf.2014.18.00017

Received: 04 Apr 2014; Published Online: 04 Jun 2014.

* Correspondence:
Dr. Yi Zeng, Institute of Automation, Chinese Academy of Sciences, Beijing, China, yi.zeng@ia.ac.cn
Mr. Dongsheng Wang, Institute of Automation, Chinese Academy of Sciences, Beijing, China, dongsheng.wang@ia.ac.cn
