Multilingual Speech Transcription System for Kazakh, Russian and English Languages
DOI:
https://doi.org/10.20508/95kg3z55
Keywords:
Multilingual Speech Transcription, Offline Speech Recognition, Language Detection, Audio Preprocessing, Digital Forensics
Abstract
This paper presents a novel multilingual speech transcription system for Kazakh, Russian, and English. Unlike existing solutions such as OpenAI Whisper and Kaldi-based offline models, the proposed system introduces three key innovations: (1) a specialized preprocessing pipeline optimized for Kazakh phonetic characteristics, (2) dynamic language detection with confidence scoring, and (3) fine-tuned acoustic models trained on a trilingual dataset of 450 hours. For Kazakh, the system achieves a Word Error Rate (WER) of 8.7% (95% CI: [8.1%, 9.3%]) versus 11.3% for Whisper, a relative reduction of 23%, and it achieves a 15% improvement on code-switched utterances. Paired t-tests (p < 0.001) and 95% confidence intervals confirm the superiority of the proposed approach across all target languages. The system runs completely offline and is specifically optimized for Central Asian linguistic patterns. However, Whisper supports 99 languages and has better zero-shot performance, whereas the proposed system supports only three languages.
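The 23% figure follows from the two reported WER values as a relative reduction: (11.3 - 8.7) / 11.3 ≈ 0.23, which corresponds to an absolute difference of 2.6 percentage points. The sketch below illustrates the standard WER definition (word-level edit distance divided by the number of reference words) and that relative-reduction arithmetic. It is not the authors' evaluation code; the function names `word_error_rate` and `relative_wer_reduction` are hypothetical, and text normalization (casing, punctuation), which real evaluations usually apply first, is assumed to have been done already.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumes both strings are already normalized and whitespace-tokenizable.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction by which system_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
    # WER values reported in the abstract (Whisper vs. proposed system).
    print(f"{relative_wer_reduction(11.3, 8.7):.1%}")
```

Reporting both the absolute WER values and the relative reduction, as the abstract does, avoids ambiguity about which kind of improvement a percentage refers to.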
Licensing
All articles published in Artificial Intelligence Research and Applications (AIRA) are licensed under an open access Creative Commons CC BY 4.0 license, meaning that anyone may download and read the paper for free. In addition, the article may be reused and quoted provided that the original published version is cited. These conditions allow for maximum use and exposure of the work, while ensuring that the authors receive proper credit.
In exceptional circumstances, articles may be licensed differently. If you have a specific condition (such as one linked to funding) that does not allow this license, please mention this to the editorial office of the journal at submission. Exceptions will be granted at the discretion of the publisher.