Arabic Dialect Speech-Text Recognition Using Deep Learning

dc.contributor.advisor: Dr Manar Al Khatib
dc.contributor.author: RAEIALBOOM, OMAR SALEH DARWISH
dc.date.accessioned: 2025-01-23T08:47:09Z
dc.date.available: 2025-01-23T08:47:09Z
dc.date.issued: 2024-09
dc.description.abstract: Recently, the widespread use of media networking has emphasised the importance of precisely identifying users’ feelings, covering a spectrum from contentment to dissatisfaction, in the domain of online communications. The dissertation addresses the challenges of accurately transcribing Arabic speech, which stem from the language’s complexity, limited audio resources, and diverse regional variations. Traditional speech recognition models struggle in this domain, prompting an exploration of deep learning approaches, specifically the TestRCNN and Hybrid TestRCNN-CNN models. The research begins with a comprehensive preprocessing stage, which involves loading audio data, extracting features using Mel-Frequency Cepstral Coefficients (MFCCs), and encoding labels. Both models are trained and evaluated on a curated dataset of Arabic speech samples, capturing spatial and temporal features. The TestRCNN model combines convolutional layers for local feature extraction with recurrent layers that capture temporal dependencies. It achieves an accuracy of 93% and a word error rate (WER) of 0.0986, but has difficulty distinguishing closely related phonetic sounds. To address these limitations, a hybrid approach is proposed that combines the TestRCNN and CNN architectures. This hybrid model leverages the CNN’s ability to extract detailed spatial features and the TestRCNN’s proficiency in capturing long-term dependencies. The Hybrid TestRCNN-CNN model outperforms the TestRCNN, achieving an accuracy of 94% and reducing the WER by more than half, to 0.0460. The dissertation provides a detailed comparison of the models’ operational features, hyperparameters, and outcomes. Through extensive experimentation, the study highlights the hybrid approach’s advantages in accurately transcribing Arabic speech and contributes valuable insights to the field of Arabic speech recognition research.
dc.identifier.other: 23000715
dc.identifier.uri: https://bspace.buid.ac.ae/handle/1234/2753
dc.language.iso: en
dc.publisher: The British University in Dubai (BUiD)
dc.subject: Arabic speech recognition, deep learning, TestRCNN, hybrid TestRCNN-CNN
dc.title: Arabic Dialect Speech-Text Recognition Using Deep Learning
dc.type: Dissertation
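
Note on the WER figures in the abstract: the record does not say how the dissertation scored its transcripts, so the following is only a minimal sketch of the standard word error rate definition, assuming whitespace tokenisation and equal cost for substitutions, deletions, and insertions. The function name `word_error_rate` and the sample strings are hypothetical and are not taken from the dissertation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Assumption: words are separated by whitespace and every edit costs 1.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only: one substituted word out of four reference words.
print(word_error_rate("one two three four", "one two tree four"))
```

Under this definition, a WER of 0.0460 corresponds to roughly 4.6 word-level errors per 100 reference words, compared with about 9.9 per 100 at a WER of 0.0986.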