Research Topic

Mining Scientific Papers Volume II: Knowledge Discovery and Data Exploitation

About this Research Topic

This Research Topic aims to promote interdisciplinary research in computational linguistics and natural language processing (NLP) applied to bibliometrics/scientometrics and information retrieval. It is a follow-up to the Research Topic Mining Scientific Papers: NLP-enhanced Bibliometrics.

The processing of scientific writing, which includes the analysis of citation contexts as well as information extraction from scientific papers for various applications, has been the object of intensive research during the last decade. Two factors have made this possible. The first is the growing availability of scientific papers in full text and in machine-readable formats, together with the rise of Open Access publishing on online platforms such as arXiv, CiteSeer or PLoS. The second is the relative maturity of open source tools and libraries for natural language processing that facilitate text processing (e.g. NLTK, MALLET, OpenNLP, CoreNLP, GATE, CiteSpace). As a result, a large number of studies have been dedicated to citation context analysis, as well as to the summarization and recommendation of scientific papers.

This Research Topic aims to discuss novel approaches that focus on the processing and exploitation of data extracted from scientific literature. In particular, the possibility of enriching metadata through the full-text processing of papers opens new fields of investigation related to the representation of data and the production of knowledge by aggregating data from multiple documents. Given the wide range of available techniques, several questions arise in this field: What volume of scientific data should be considered exploitable and allow the production of new knowledge through aggregation? How can knowledge generated from data in scientific articles be represented? What types of data and knowledge can be automatically extracted from scientific articles, and how can they be exploited efficiently?

We also invite papers (e.g. Brief Research Reports, Data Reports, Methods, Opinions, Original Research) produced by participants in the recently launched COVID-19 Open Research Dataset Challenge (CORD-19). Teams working on the CORD-19 dataset and addressing tasks related to this Research Topic are welcome to contact the organizers and submit.

The objective of this Research Topic is to share interdisciplinary techniques around the theme of text mining applied to scientific articles. The spectrum of the topic includes computational linguistics as well as machine learning approaches to enrich the content of scientific articles and to facilitate the exploitation of their data. It also covers rule-based techniques, the implementation of grammars, and artificial intelligence, along with methods that improve how large-scale text analysis, text mining, and sense mining of scientific articles can benefit from these techniques.

Topics under the research theme include, but are not limited to:
• Information extraction, text mining and parsing of scholarly literature
• Datasets for mining scientific papers
• Natural Language Processing (NLP) applied to citation analysis, recommendation and classification
• Discourse modeling and argument mining
• Methodology and models for content citation analysis
• Semantic and network-based indexing, search and navigation in structured text
• Knowledge discovery and visualization
• Information seeking behavior and human-computer interaction in academic search
• Scientific document engineering
• Data exploitation and information extraction from scientific articles
• The emergence of research questions in text processing for bibliometric purposes


Keywords: Text Mining, Information Retrieval, Natural Language Processing, Academic Search, Citation Contexts


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

About Frontiers Research Topics

With their unique mixes of varied contributions from Original Research to Review Articles, Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author.

Submission Deadlines

Manuscript: 01 September 2020
