ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
Volume 8 - 2025 | doi: 10.3389/frai.2025.1630743
Efficient Spatio-temporal Modeling for Sign Language Recognition using CNN and RNN Architectures
Provisionally accepted
- 1 Nelson Mandela African Institution of Science and Technology, Arusha, Tanzania
- 2 Mzumbe, Morogoro, Tanzania
Computer vision has been identified as one way to bridge the communication barrier between speech-impaired people and those without impairment, since most people do not know the sign language that speech-impaired individuals use. Numerous studies have addressed this problem. However, recognizing word signs, which are usually dynamic and span more than one frame per sign, remains a challenge. This study used Tanzania Sign Language datasets collected with mobile phone selfie cameras to investigate the performance of deep learning algorithms that capture both the spatial and the temporal features of video frames. The study used CNN-LSTM and CNN-GRU architectures, and it proposes a CNN-GRU model with an ELU activation function to improve learning efficiency and performance. The findings indicate that the proposed CNN-GRU model with ELU activation achieved an accuracy of 94%, compared with 93% for the standard CNN-GRU and CNN-LSTM models. The study also evaluated the proposed model in a signer-independent setting, where results varied significantly across individual signers and the highest accuracy reached 66%. These results show that more work is needed to improve signer-independent performance, including addressing hand dominance by optimizing spatial features.
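Two ideas in the abstract can be illustrated with a small sketch: the ELU activation and a signer-independent evaluation split. The code below is only an illustration under stated assumptions and is not the authors' implementation. The ELU definition is the standard one, with `alpha=1.0` as an assumed default. The abstract does not say how the signer-independent setting was constructed, so the leave-one-signer-out protocol, the function name `leave_one_signer_out`, and the `(signer_id, video_id, label)` sample layout are all hypothetical.

```python
import math
from collections import defaultdict


def elu(x, alpha=1.0):
    """Standard ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def leave_one_signer_out(samples):
    """Yield (held_out_signer, train, test) folds with no signer overlap.

    `samples` is a list of (signer_id, video_id, label) tuples. This layout
    is an assumption for illustration, not the study's dataset format.
    """
    by_signer = defaultdict(list)
    for sample in samples:
        by_signer[sample[0]].append(sample)

    for held_out in sorted(by_signer):
        test = by_signer[held_out]
        train = [
            s
            for signer, group in by_signer.items()
            if signer != held_out
            for s in group
        ]
        yield held_out, train, test


# Toy data (hypothetical signers, videos, and labels).
samples = [
    ("signer_a", "v1", "hello"),
    ("signer_a", "v2", "thanks"),
    ("signer_b", "v3", "hello"),
    ("signer_c", "v4", "thanks"),
]
folds = list(leave_one_signer_out(samples))
```

Unlike an activation that outputs zero for every negative input, ELU keeps a small, bounded negative output (approaching `-alpha`), so negative inputs still carry a gradient. Holding out every video of one signer at a time means each test fold contains only a person the model never saw in training, which is one common way to obtain the kind of per-signer accuracy figures the abstract reports.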
Keywords: CNN-GRU, CNN-LSTM, deep learning, ELU activation function, sign language, Tanzania Sign Language
Received: 18 May 2025; Accepted: 04 Aug 2025.
Copyright: © 2025 Myagila, Nyambo and Dida. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Kasian Myagila, Nelson Mandela African Institution of Science and Technology, Arusha, Tanzania
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.