AUTHOR=Mohtad Younus Muhammad, Iqbal Arshad, Durrani Esha e Noor, Ahmad Naveed, Ladan Mohamad

TITLE=A hybrid voice cloning for inclusive education in low-resource environments

JOURNAL=Frontiers in Computer Science

VOLUME=Volume 7 - 2025

YEAR=2025

URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2025.1675616

DOI=10.3389/fcomp.2025.1675616

ISSN=2624-9898

ABSTRACT=Introduction: Voice cloning can personalize speech technologies but typically requires large datasets and substantial compute, limiting its use in low-resource educational settings. Methods: We propose a hybrid pipeline combining a GE2E-trained speaker encoder, a Tacotron-based text-to-spectrogram synthesizer, and a modified WaveRNN vocoder with gated GRUs and skip connections. The system targets few-shot adaptation (5–10 s of target speech) and near real-time synthesis on modest hardware. Results: On LibriSpeech, VCTK, and noisy YouTube/local corpora, the system achieves MCD ≈ 4.8–5.1 and improves MOS over baselines (e.g., LibriSpeech: 4.55 vs. 4.33; YouTube: 3.82 vs. 3.10), with EER < 12% on an external automatic speaker verification (ASV) system, indicating strong speaker similarity. Discussion: The results show data-efficient, robust voice cloning suitable for inclusive education, with practical considerations for deployment (compute, noise) and responsible use (consent, watermarking, detection). The approach supports assistive and multilingual classroom scenarios in low-resource contexts.