Dissertations for Informatics (Knowledge and Data Management)
Browsing Dissertations for Informatics (Knowledge and Data Management) by Subject "Arabic language"
Now showing 1 - 5 of 5
Item: Arabic Question Answering from diverse data sources (The British University in Dubai (BUiD), 2018-07) KHATER, FERAS
Arabic users are still forced to extract accurate answers to their questions manually, a difficult task given the vast amount of information available on the Internet. Existing Arabic Question Answering (QA) systems do not meet users' needs in terms of performance or coverage of all question types. The motivation behind this research is the need for new approaches that handle all types of questions, beyond factoid questions. We therefore present in this paper a new design of the linguistic approach for developing a reliable Arabic QA system and data source that addresses the following challenges: (i) handling both factoid and complex questions in the Arabic language, (ii) extracting the precise answer from available resources, (iii) evaluating the proposed QA system against a gold-standard data set, and (iv) providing an Arabic Corpus of Occupations (ACO) that is freely and publicly available for research purposes. Our QA system is a web application that retrieves an answer to a posed question from different data sources. We conducted experiments on a set of 230 questions drawn from previously published resources: TREC, CLEF, and the ACO corpus. The system achieved an average precision of 36%, answering 72 questions, with a recall of 78% and an F-measure of 51%. The motivation for building the ACO corpus was the lack of free, annotated, large-scale Arabic resources that can be used for training and testing Arabic QA systems. In this paper, we provide an ACO corpus of one million words written in Modern Standard Arabic (MSA). The corpus contains 700 occupations, which were carefully analyzed and manually annotated.
We use Cohen's Kappa coefficient to evaluate the reliability of the tagged content. The corpus content was tagged and assessed by two different groups of taggers. The inter-annotator agreement indicates that the reliability of the ACO corpus falls in the range of almost perfect agreement, and the corpus content is highly reliable according to the achieved result of 90%.
Item: Arabic Sentiment Analysis for Gulf Opinion Leaders using a Deep Learning Approach Case Study: Covid-19-22 (The British University in Dubai (BUiD), 2023-07) ALKETBI, SULTAN
The COVID-19 pandemic has had a profound impact on global health and has affected various populations worldwide. In the Arab world, social media has emerged as a critical platform for expressing opinions, sharing information, and disseminating news related to COVID-19. However, the proliferation of false information and the spread of fear and panic on social media have created a significant problem. This study aims to investigate how Arab populations, including both opinion leaders and the general public, have responded to the COVID-19 pandemic on Twitter. The research focuses on analysing sentiment and developing a deep learning model to detect real news associated with the pandemic in Arabic text. By gathering and analyzing data from Gulf countries, the study provides insights into the sentiments expressed and contributes to understanding how opinion leaders and the general public engage with COVID-19 on Twitter. Additionally, the study evaluates the efficacy of the deep learning model in combating misinformation and highlights the significance of sentiment analysis and news detection in the Arabic language. Data collection was conducted using Twitter's API, focusing on Arabic tweets from Gulf opinion leaders and using specific keywords, hashtags, and user accounts related to COVID-19.
The testing phase involved collecting 100,000 tweets from January to June 2022, with an emphasis on quality and relevance, including opinion leaders with significant follower counts and those recognized for their expertise or influence in the field. Overall, this research contributes to understanding the response to COVID-19 on Twitter and provides valuable insights into sentiment analysis and the detection of real news in Arabic text.
Item: Arabic Sentiment Analysis using Machine Learning (The British University in Dubai (BUiD), 2016-09) ATIYAH, SASI FUAD
Sentiment Analysis is a rising field that is gaining popularity every day due to its importance in mining public opinion. The immense amount of data generated every second over the Internet via social networks, microblogs, blogs, forums, consumer websites and other sources presents a rich field of opinions that are ready to be populated, aggregated and summarized, and on that basis decisions are made. The applications range from classical problems such as political campaigns and product reviews to more sophisticated uses in Human Machine Interaction, where detecting human sentiment plays an important role in successful machine interaction. In this research we investigated the problem of sentiment analysis in the Arabic language, focusing on how to make the most of the machine learning-based approach by conducting several experiments on several multi-domain datasets, optimizing the trained models through parameter optimization, and using the findings to establish predefined best parameter settings for use on new datasets. The research showed that, through parameter optimization, basic machine learning classifiers achieved higher results than more complex hybrid approaches. In addition, the overall parameter settings were tested on two new datasets and provided very promising results, indicating that the performance was not a result of overfitting.
The research also explains the issues of testing such well-trained models on an unseen dataset from different sources in the same domain and how they can be solved. The work concludes with possible enhancements to the work done and a new path for future work that promises a more generalized solution.
Item: Question Processing for Arabic Question Answering System (The British University in Dubai (BUiD), 2015-05) Al Chalabi, Hani Maluf
Due to the very fast growth of information in the last few decades, getting precise information in real time is becoming increasingly difficult. Search engines such as Google and Yahoo help in finding information, but they return it in the form of documents, which consumes a lot of the user's time. Question Answering Systems have emerged as a good alternative to search engines, as they produce the desired information precisely and in real time. This saves a lot of time for the user. Question Answering systems are given questions in natural language, and the expected output is either the suitable answer recognized in a text or small text fragments containing the answer. There has been a lot of research on Question Answering Systems for English and some European languages. However, Arabic Question Answering Systems could not match this pace due to some inherent difficulties with the language itself as well as the lack of tools available to assist researchers. Therefore, in this dissertation, we take up the challenge of designing and developing some modules of an Arabic Question Answering System. The task of Question Answering can be divided into three phases: Question Analysis, Document Analysis, and Answer Analysis. Our dissertation is concerned with the first phase, i.e., the Question Analysis phase.
The question analysis phase consists of two major tasks, namely Question Classification and Query Expansion, besides other minor tasks such as stop word removal and Part of Speech tagging. We have proposed methods to accomplish these two major tasks in the Question Analysis phase. We have used Nooj and Arabic WordNet (AWN) to implement our methods. In order to evaluate the performance of the proposed methods, we have used the Arabic-language corpus developed by Y. Benajiba, which is available at http://users.dsic.upv.es/~ybenajiba/.
Item: Towards a Unified Arabic Government Services Chatbot Based on Ontology (The British University in Dubai (BUiD), 2020-09) Areed, Sufyan
The amount of information has risen considerably since the beginning of the electronic era. Meanwhile, the relationships between various types of information have become more and more complex. On the other hand, the growing number of users has encouraged researchers to take advantage of this information and develop techniques that analyze customers' experiences to accommodate their requirements and satisfy their needs. Since the birth of the term "e-government", the amount of structured and unstructured information has increased dramatically, which has forced government entities worldwide to provide many call centers with long working hours, reaching 24/7 in some entities, to answer users' questions related to government services, regardless of the time clients waste while waiting to reach an agent. The UAE government is no exception. As far as the Dubai government is concerned, some entities have taken a step forward by introducing their own chatbot technology to respond to customer inquiries.
However, the challenges of chatbot development in general, in addition to the challenges associated with the Arabic language in particular, have made it extremely difficult to design a unified chatbot capable of responding to queries about all services provided by the Dubai government. In this study, a novel approach is proposed to extract an Arabic language knowledge base from a previously built ontology covering more than 500 services provided by the Dubai government, in order to use the extracted knowledge base in a chatbot application through Artificial Intelligence Markup Language (AIML) files. The current ontology is an enhancement that builds on a previously created ontology. Furthermore, a chatbot response algorithm is proposed that responds to government service queries through a hybrid of three different approaches executed in a pipeline fashion based on query complexity. The feasibility of the proposed algorithm was demonstrated by running multiple experimental tests on the same set of 414 questions previously used to test the ontology; the algorithm was then compared with the ontology itself and with Rashid, the official chatbot designed by Smart Dubai Government. The proposed algorithm achieved a high accuracy of 96%, with a recall of 100% and a precision of 96%. These results confirm that the proposed algorithm can outperform both a previously developed ontology-based chatbot and the Rashid chatbot.
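The accuracy, precision and recall figures quoted in the abstract above are standard confusion-matrix metrics. As a minimal sketch of how such figures relate to one another, the Python snippet below computes them from raw counts. The function name `evaluate`, the `Metrics` class, and the counts used in the example are illustrative assumptions; they are not taken from the dissertation, which does not report its raw counts here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(tp: int, fp: int, fn: int, tn: int = 0) -> Metrics:
    """Compute standard metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical counts, chosen only for illustration (NOT the study's data):
# 48 correct answers, 2 wrong answers, no missed questions.
m = evaluate(tp=48, fp=2, fn=0)
print(m)
```

With no false negatives and no true negatives, recall is 1.0 and accuracy equals precision, which matches the pattern of figures quoted in the abstract (accuracy and precision both 96%, recall 100%). Whether the study's counts were structured this way is an assumption.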