Automatic Recognition of Poets for Arabic Poetry using Deep Learning Techniques (LSTM and Bi-LSTM)

The British University in Dubai (BUiD)
Arabic poetry, with its beauty, deep cultural importance, and distinctive linguistic features, has long attracted scholars and readers, and many researchers and writers have analyzed poems to extract their deeper poetic features. As the literature review shows, there have been numerous successful attempts to identify these traits and characteristics, such as categorizing the poetic meter used and identifying the poets behind the poems. In this research, we introduce a comprehensive approach to Arabic poetry text classification using deep learning techniques. We use a dataset of almost one million Arabic poetry verses extracted from a poetry encyclopedia. The verses are labeled with nine different poets and cover both classical and modern poetic styles. Arabic poetry is complex: it makes heavy use of metaphor, figurative language, and unbounded imagination, and its style varies from one poet to another and from one poem to another. We tackle these challenges through careful preprocessing, feature engineering, and feature selection. We also explore a range of algorithms, including traditional classifiers and deep learning models, to select the most suitable and accurate models for identifying poets' names from the verses. We employ LSTM and Bi-LSTM as our main baseline models. We selected them because work on text classification concentrates on the RNN (Recurrent Neural Network) and its variants, and LSTM has proven its capability for sequential data analysis in many different languages. Our reported results show promising classification accuracy, with an average of 92.35%. This sheds some light on the feasibility of automating the classification of text in a morphologically complex language (Arabic).
Bi-LSTM slightly outperformed the classic LSTM under normal conditions, with average accuracies of 92.15% for LSTM and 92.56% for Bi-LSTM. We discuss the impact of our research findings on Arabic literature, particularly Arabic poetry. We also address the challenges associated with this study.
Arabic poetry, text classification, RNN, LSTM, Bi-LSTM