Language Models for Low-Resource Languages

About this Research Topic

Submission deadlines

  1. Manuscript Summary Submission Deadline: 5 May 2026

  2. Manuscript Submission Deadline: 23 August 2026

This Research Topic is currently accepting articles.

Background

Language is the medium through which we access the world, and it reflects the culture of the people who speak it. Artificial intelligence (AI) language models (LMs) have a long history of development. Since the introduction of the transformer architecture in 2017, the adoption of large language models (LLMs) capable of engaging in dialogue, answering queries, and generating human-like content has grown. These advances offer great opportunities for new technological applications and services that will benefit people.

Globally, there are approximately 7,000 spoken languages. However, most LLMs focus on only about 50 high-resource languages.

Though they have little digital presence, minority (low-resource) languages are a large and culturally significant reality. In many regions, they are spoken by a significant portion of the vulnerable population. Language barriers reduce opportunities for quality education, healthcare, financial access, employment, and other services that contribute to a high quality of life.

Although low-resource languages represent significant global communities, they generally lack the digital data and resources needed to support AI-based LM tasks or to benefit from recent advances in the field.

In particular, low-resource languages face two significant limitations: a shortage of labeled and unlabeled language data, and poor-quality data that does not sufficiently represent the languages and their sociocultural contexts.

Some efforts have been made through workshops and conferences (e.g., the Conference of the European Chapter of the Association for Computational Linguistics, EACL; LoResLM; IberLEF; and the Conference on Language Modeling, COLM), some of which are better established than others. However, a special issue in a high-impact journal would be an opportunity to consolidate these efforts and address challenges in the field. Ultimately, this special issue aims to advance responsible AI innovation that empowers low-resource language communities and shapes a more inclusive future for global language technologies.

We invite submissions on a broad range of topics related to the development and evaluation of language models for low-resource languages, including but not limited to the following:



• Building LMs for low-resource languages.

• Adapting/extending existing LMs/LLMs for low-resource languages.

• Corpora creation and curation technologies for training LMs/LLMs for low-resource languages.

• Strategies for overcoming data scarcity.

• Benchmarks to evaluate LMs/LLMs in low-resource languages.

• Prompting/in-context learning strategies for low-resource languages with LLMs.

• Promoting participatory research with low-resource language communities.

• Review of available corpora to train/fine-tune LMs/LLMs for low-resource languages.

• Multilingual/cross-lingual LMs/LLMs for low-resource languages.

• Applications of LMs/LLMs for low-resource languages (e.g., machine translation, chatbots, content moderation).

• Bias and fairness in low-resource language technologies.

• Sociolinguistic considerations in technology development.

• Cultural appropriateness and sensitivity.


Article types and fees

This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:

  • Brief Research Report
  • Conceptual Analysis
  • Data Report
  • Editorial
  • FAIR² Data
  • FAIR² DATA Direct Submission
  • General Commentary
  • Hypothesis and Theory
  • Methods

Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee charged to authors, institutions, or funders.

Keywords: Language Models (LMs), Low-Resource Languages, Adaptive Large Language Models (LLMs), Corpora creation, Multilingual pre-training, Cross-lingual knowledge transfer, Ethical considerations, Cultural sensitivity

Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.


Manuscripts can be submitted to this Research Topic via the main journal or any other participating journal.