About this Research Topic
Recent advances in AI, including automatic speech recognition (ASR) in low-resource languages, few-shot and transfer learning, advanced deep-learning architectures, and self-supervised pretraining methods, afford new opportunities for low-resource speech sciences. For example, in the ADReSS Challenge (Alzheimer’s Dementia Recognition through Spontaneous Speech) at Interspeech 2020, a number of teams applied the scheme of “fine-tuning pre-trained language models” to the task. With data from only 108 subjects for training (fine-tuning), the winning team achieved 90% accuracy on the test set using this scheme. Collecting data in low-resource languages and settings has also become convenient and efficient. LanguageARC (https://languagearc.com), an online data collection and analysis platform developed by the Linguistic Data Consortium, provides an excellent example. The remaining challenge, however, lies in bridging data and research in low-resource languages.
The proposed Research Topic aims to promote the application of artificial intelligence and speech technologies to the collection, analysis, and understanding of speech from low-resource natural languages and atypical speakers. As part of the Research Topic, we plan to collect and make available a number of datasets in low-resource languages through sentence reading, object naming, and picture description on LanguageARC. With these datasets, research can be conducted to create a grammar sketch of an unknown language (e.g., sound structure, lexicon, morphology). Papers based on these or other datasets will be published in the Research Topic. The Research Topic will also include papers on language disorders. Automatic detection of disorders through language and speech has been a long-standing interest and effort. Recent developments in AI technology have brought promising breakthroughs. The Research Topic will present the latest efforts in this area, covering both data collection and research outcomes.
The future success of inclusive speech technology for atypical speech communication and low-resource languages would benefit from further communication between speech technologists and speech scientists. While these domains have traditionally been treated separately, they share the challenges of populations whose language has been largely overlooked during the development of AI technologies, leading to a lack of standard resources and thus poorly understood phenotypes or observable surface characteristics (e.g., in Cebuano, the extraordinary degree of borrowing from English and Spanish). This Research Topic welcomes submissions that promote interdisciplinary research and collaboration in the application of artificial intelligence and speech technologies to the collection, analysis, and understanding of speech from low-resource natural languages and atypical speakers. We envision contributions including but not limited to:
1. research agendas that bring the AI sector and speech scientists together for cross-fertilization;
2. computational programs for efficient data collection and analysis;
3. novel methods for creating a grammar sketch of an unknown language;
4. original research on automatic assessment of language disorders.
Keywords: Low-resource, Speech Science, Speech Technology, Language Disorders, Low-resource Languages, Human Language Technology
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.