Sentiment Analysis for Arabic Social Media Movie Reviews Using Deep Learning
The British University in Dubai (BUiD)
This work applies sentiment analysis (SA) to Arabic movie reviews on social media. Automatically detecting the attitude or sentiment expressed in a text is often helpful. By classifying text as positive, negative, or neutral, SA helps us understand the specific emotions that underlie the broader feelings typically associated with behavior. This work seeks to improve classification performance by combining multiple word representations with deep learning approaches. Mobile apps, the internet, and social media portals have driven a tremendous increase in data in recent years. Thanks to the rapid development of technologies and social media platforms, people can now share their opinions on specific topics, and people all around the world frequently use these platforms to share their evaluations and opinions of movies. Technologies such as machine learning (ML) and deep learning (DL) make it simpler for individuals to identify movies that live up to their expectations by evaluating prior reviews. Massive amounts of data can be collected every day from social media networks such as YouTube, Twitter, Instagram, and many other platforms. The tools used for collecting data were Vicinitas for Twitter and IGCommentExport for Instagram. The test datasets were collected mainly from Instagram and contain reviews of two Arabic movies: Wahed Tani, which translates to "Someone Else", and Amahom, which translates to "Their Uncle". Three datasets were employed, and several classification models were compared across them. Before sentiment analysis can be performed, the data must be prepared so that it can be used to train ML algorithms. The collected corpus was annotated manually to label it for ML use.
Data pre-processing is a crucial step in NLP tasks: it improves performance on the dataset and helps ensure the accuracy of the sentiment analysis. We translated some of the most common emojis into their Arabic meanings. Arabic has several varieties, the three main ones being Classical Arabic (CA), Modern Standard Arabic (MSA), and Dialectal Arabic (DA). In this paper we focus on DA, since it is commonly used on social media. The main dataset was the Arabic Sentiment Analysis Dataset (ASAD), a large Twitter-based benchmark (Alharbi et al., 2020). The proposed CNN, RNN, CNN-RNN, and BERT models were applied to the three datasets. Two of these datasets were used to compare the BERT model with the other examined models. We tested the CNN model first, then the LSTM, and finally the hybrid CNN-LSTM. After comparing these three models, the best one was chosen for comparison with the BERT model. The hybrid CNN-LSTM model achieved an accuracy of 90%. Finally, we compared the CNN-LSTM model with the BERT model. The BERT model outperformed all other classifiers in terms of accuracy (91%), recall (71%), precision (83%), and F-measure (77%).
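As a quick consistency check on the BERT figures reported above, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall.

    Assumption: the reported F-measure is F1 (beta = 1).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are 0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the BERT model in the abstract.
bert_precision = 0.83
bert_recall = 0.71

bert_f1 = f_measure(bert_precision, bert_recall)
print(f"F-measure: {bert_f1:.2f}")  # prints "F-measure: 0.77"
```

Under this assumption the recomputed value rounds to 77%, which agrees with the reported F-measure.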
sentiment analysis, deep learning, Arabic sentiment analysis, social media movie reviews, social media platforms