Exploring Sentiment Analysis using Different Machine Learning Algorithms on Dialectal Arabic
AL MANSOORI, MOUZA
Today, the intense use of and reliance on modern technologies such as mobile phones, e-commerce, interactive websites, social media, wearable technologies, sensors, and satellites generates data every second, resulting in huge volumes of structured and unstructured data. The field of big data analytics therefore emerged to tame this data and use it to provide useful insights to the world. Sentiment Analysis is one application of big data analytics to text data. It refers to the process of extracting and analysing emotions from a given text to classify its polarity, mainly into three classes: positive, negative and neutral. Much research has been done on Sentiment Analysis for English text data, while more exploration is still required on Arabic Twitter sentiment analysis. This paper focuses on dialectal Arabic sentiment analysis. The study explores sentiment analysis using different machine learning algorithms on dialectal Arabic text datasets. We used Twitter as our data source, so our datasets consist of Arabic tweets. The purpose of this study is to examine the performance of sentiment analysis on three datasets that contain different levels of dialectal Arabic: a mixed-dialect dataset, a Gulf dialect dataset, and an Emirati dialect dataset. Two machine learning classifiers were used in this experiment: the support vector machine (SVM) and Naïve Bayes (NB). The results of the experiment indicate that when sentiment analysis is applied to one specific dialect group, accuracy is higher than on the mixed-dialect dataset under the same settings. The experiment also supports other studies in that the SVM classifier outperformed the NB classifier. We conclude that additional research is required to explore Arabic sentiment analysis further, taking different dialects into account.
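As a rough illustration of the three-class polarity classification described above, the following is a minimal sketch of a multinomial Naïve Bayes classifier, one of the two classifier families named in the abstract. It is not the study's implementation: the function names (train_nb, predict_nb, accuracy), the whitespace tokenization, the add-one smoothing, and the toy English placeholder texts are all assumptions made for illustration, and the study's actual preprocessing, features, and Arabic tweet data are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count documents per class and tokens per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.split()  # assumption: plain whitespace tokenization
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.split():
            if token not in vocab:
                continue  # ignore tokens never seen in training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the given label."""
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy placeholder data, invented for this sketch (not from the study).
train_docs = [
    "great lovely great", "awful terrible", "meeting today",
    "lovely happy", "terrible sad awful", "today report",
]
train_labels = [
    "positive", "negative", "neutral",
    "positive", "negative", "neutral",
]
model = train_nb(train_docs, train_labels)
print(predict_nb(model, "great happy"))
```

Comparing dialect groups in the spirit of the abstract would mean training and scoring one such model per dataset (mixed, Gulf, Emirati) under identical settings and comparing the resulting accuracy values; an SVM would be evaluated in the same way for the classifier comparison.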