About this Research Topic

Manuscript Submission Deadline 01 May 2022
Manuscript Extension Submission Deadline 18 July 2022

The success of state-of-the-art Natural Language Processing (NLP) solutions depends heavily on the availability of large amounts of annotated and unannotated data. Although a few high-resource languages have access to such large-scale datasets, most of the world's 6500+ languages do not share that luxury. Even for high-resource languages, domain-specific data can be scarce. Consequently, interest has grown in recent years in developing NLP solutions that work well in low-resource settings, alongside research efforts to create new resources and benchmarks. These efforts raise questions not only of how to build new resources, but also of how they are used: as low-resource NLP research emerges, issues such as bias, fairness, and interpretation in new cultures also come into play. In particular, the negative impacts of using high-resource languages and domains to support the learning of computational models for low-resource languages and domains are yet to be explored.

Some low-resource NLP research focuses on creating novel language resources and benchmarks, while other work customizes existing NLP solutions for new languages and domains. There are also novel NLP techniques that apply equally to low-resource languages and low-resource domains. In parallel, some research actively explores new NLP techniques that could generalize across different low-resource setups, in terms of both data availability and the availability of computational resources. The field is in dire need of a common venue that welcomes research in these multiple directions. This Research Topic aims to paint a broad picture of the state of research across this wide spectrum of topics related to NLP for low-resource languages and domains.

We welcome contributions of any type related to low-resource languages and domains. These include, but are not limited to:

• NLP Techniques for low-resource languages or domains
• Domain adaptation
• Transfer learning
• Zero-shot and few-shot learning
• Meta learning
• Knowledge distillation
• Multilingual and cross-lingual learning
• Data augmentation

• Datasets and Evaluation for low-resource languages or domains
• New benchmarks
• Evaluation mechanisms
• Multimodal resources
• Language resources (multilingual/monolingual, annotated/unannotated)

• NLP Tasks for low-resource languages or domains
• Bias, fairness and ethics in NLP
• Dialog and interactive systems
• Discourse and pragmatics
• Document analysis including text categorization, topic models, and retrieval
• Natural language generation
• Information extraction, text mining, and question answering
• Language-inclusive multimodal integration
• Machine translation
• Multilinguality
• Phonology, morphology, and word segmentation
• Semantics
• Text classification
• Fake news and hate-speech detection
• Sentiment analysis and opinion mining
• Social media analysis: Twitter, blogs, discussion forums, and other social media
• Speech, prosody, and spoken dialog
• Summarization
• Tagging, chunking, syntax, and parsing

Keywords: NLP, low-resource languages, low-resource scenarios and settings, domain adaptation, language resources, evaluation benchmarks


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

About Frontiers Research Topics

With their unique mixes of varied contributions from Original Research to Review Articles, Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author.