METHODS article

Front. Lang. Sci.

Sec. Language Processing

Volume 4 - 2025 | doi: 10.3389/flang.2025.1569448

Project Euphonia: Advancing Inclusive Speech Recognition through Expanded Data Collection and Evaluation

Provisionally accepted
Alicia Martin1, Robert MacDonald1*, Pan-Pan Jiang1, Marilyn Ladewig2, Julie Cattiau1, Rus Heywood1, Richard Cave3, Jimmy Tobin1, Philip C Nelson1, Katrin Tomanek1
  • 1Google (United States), Mountain View, United States
  • 2Cerebral Palsy Association of New York State, New York, United States
  • 3Motor Neurone Disease Association, Northampton, United Kingdom

The final, formatted version of the article will be published soon.

Speech recognition models, predominantly trained on standard speech, often exhibit lower accuracy for individuals with accents, dialects, or speech impairments. This disparity is particularly pronounced for economically or socially marginalized communities, including those with disabilities or diverse linguistic backgrounds. Project Euphonia, a Google initiative dedicated to improving Automatic Speech Recognition (ASR) of disordered speech and originally launched for English, is expanding its data collection and evaluation efforts to additional languages such as Spanish, Japanese, French, and Hindi, in a continued effort to enhance inclusivity. This paper gives an overview of how the processes and methods used for English data collection were extended to more languages and locales, reports progress on the data collected so far, and describes our model evaluation process, which focuses on assessing meaning preservation using generative AI.

Keywords: disordered speech, automatic speech recognition, speech data collection, dysarthria, artificial intelligence

Received: 04 Mar 2025; Accepted: 16 May 2025.

Copyright: © 2025 Martin, MacDonald, Jiang, Ladewig, Cattiau, Heywood, Cave, Tobin, Nelson and Tomanek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Robert MacDonald, Google (United States), Mountain View, United States

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.