Artificial Intelligence Frameworks for Sentiment Variations’ Reasoning and Emerging Topic Detection

The British University in Dubai (BUiD)
Using Sentiment Analysis techniques to monitor public opinion on social media has been an essential yet challenging task in the field of Artificial Intelligence (AI). Many studies have been conducted over the last two decades to help users track public sentiment about entities, products, events, or other targets. However, these techniques focus on extracting the overall positive/negative/neutral polarity of texts without identifying the main reasons for the extracted sentiments. This thesis contributes to the very few studies that go one step further by developing novel models to understand what causes sentiment changes over time. Identifying the main reasons for public reactions is valuable to decision-makers, who can then take the necessary actions in a timely manner. To develop our approach, we first examined existing Sentiment Reason Mining methods to identify their limitations; we then introduced our Filtered Latent Dirichlet Allocation (Filtered-LDA) Model, which overcomes major deficiencies of the base methods. This model can be used for multiple applications, including detecting new research trends in large sets of scientific papers, discovering hot topics on social media, comparing customer reviews of two products to identify their strong and weak aspects, and our focus topic of interpreting public sentiment variations. The Filtered-LDA Model uses a novel Emerging Topic Detection technique for which we developed multiple AI frameworks. It emulates the human approach to discovering new topics in a large set of documents. A human would first skim through all old and new documents to isolate the new ones that may contain Emerging Topics. These clustered documents are then analyzed to identify the high-frequency emerging topics. With this simple method, the impact of clustering errors is significantly reduced, as wrongly clustered documents do not usually contain the main keywords of high-frequency emerging topics.
Furthermore, the new frameworks introduce measures to reduce the chances of detecting old topics and to visualize candidate reasons online. Given that some social media platforms, like Twitter, use short-text documents, we first compared the accuracies of state-of-the-art Sentiment Analysis classifiers to select the best performer for short texts. Subsequently, the selected classifier is applied to a real-life large Twitter dataset, which includes around two million tweets, to extract positive/negative/neutral sentiments. The Filtered-LDA Model is tested first on a Ground Truth dataset to validate that it outperforms baseline models, and is then applied to the large Twitter dataset to automatically infer the main reasons for sentiment variations.
artificial intelligence (AI), sentiment variations’ reasoning, emerging topic detection, sentiment analysis techniques, social media, public reactions, Filtered Latent Dirichlet Allocation (Filtered-LDA), social media platforms
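
The two-stage idea described in the abstract (first isolate the new documents that may contain emerging topics, then look for what is frequent among them) can be illustrated with a minimal sketch. This is not the Filtered-LDA Model itself: it swaps in plain document-frequency counting where the thesis uses LDA, and the function name `emerging_terms`, the `novelty_threshold` parameter, and the sample texts are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emerging_terms(old_docs, new_docs, novelty_threshold=0.5, top_k=3):
    """Illustrative two-stage filter (an assumption, not the thesis's model).

    Stage 1 keeps new documents in which the share of words unseen in the
    old documents is at least `novelty_threshold`.
    Stage 2 counts, for each unseen word, how many kept documents contain it.
    """
    old_vocab = set()
    for doc in old_docs:
        old_vocab.update(tokenize(doc))

    # Stage 1: isolate the new documents that look unlike the old ones.
    candidates = []
    for doc in new_docs:
        tokens = tokenize(doc)
        if not tokens:
            continue
        unseen = [t for t in tokens if t not in old_vocab]
        if len(unseen) / len(tokens) >= novelty_threshold:
            candidates.append(unseen)

    # Stage 2: rank unseen words by the number of candidate documents using them.
    doc_freq = Counter()
    for unseen in candidates:
        doc_freq.update(set(unseen))
    return doc_freq.most_common(top_k)


old_docs = [
    "the phone battery is great",
    "great screen and battery life",
]
new_docs = [
    "update broke camera",
    "camera crashes after update",
    "battery is great",
    "the update drains battery fast",
]

result = emerging_terms(old_docs, new_docs, top_k=2)
print(result)  # [('update', 3), ('camera', 2)]
```

In this toy run, a document that passes the first stage by mistake would only add low-count words to the ranking, which mirrors the abstract's point that wrongly clustered documents rarely contain the main keywords of high-frequency emerging topics. A realistic version would also need stop-word handling and a proper topic model in the second stage.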