Dissertations for Informatics (Knowledge and Data Management)


Recent Submissions

  • Item
    The Impact of Industry 5.0 Technologies on the Performance within the Healthcare Sector through the Integration of Gamification-Based Learning Programs
    (The British University in Dubai (BUiD), 2023-06) Alsuwaidi, Laila
    Education continues to evolve in a world increasingly dominated by technology. This study explores the application and acceptance of Gamification-Based Learning Programs (GBLPs) in the healthcare sector, with the primary objective of understanding how learning techniques open the way for integrating technology into many professional fields. Building on the extended Unified Theory of Acceptance and Use of Technology (UTAUT2), the study presents novel findings that highlight the importance of mobility, facilitating conditions, habit, and availability in encouraging the use of GBLPs. However, traditional considerations such as performance expectancy, effort expectancy, social influence, hedonic motivation, and price value have little effect on the decision to accept the technology, which calls for a reevaluation of the reasons for technology use. The research used a hybrid SEM-ANN strategy, highlighting its potential to make accurate and consequential projections about future GBLP applications. While the study focused on the healthcare industry, the lessons learned apply to other professional learning settings and may help spark a revolution in how we learn in many fields. The report concludes by pointing out its limitations and calling for more research in this area across various professions. The research could pave the way for future work on the model, encouraging confidence in its ability to forecast technological uptake and serving as a roadmap for crafting more engaging and effective educational environments.
  • Item
    Evaluation of I4.0 Technologies adoption approaches within the SMEs operations based on Multi-Criteria Analysis
    (The British University in Dubai (BUiD), 2023-03) ABU-LAIL, DAREEN
    With the introduction of business intelligence and the necessity of making informed, data-driven decisions, it is becoming a challenge for decision makers and researchers to integrate data science into day-to-day operations and the decision-making process. In particular, Small and Medium Enterprises (SMEs) are among the most affected stakeholders: they struggle to catch up with the digital transformation led by large companies because of several technological, organizational and environmental factors. This study proposes a decision-making model tailored to the challenges and context of SMEs undertaking digital transformation, with a focus on I4.0 technology adoption approaches. So far, no research has provided a methodology for comparing I4.0 technology adoption options within the SME sector in order to determine the optimum strategy to recommend to SME decision makers. Analyzing and comparing I4.0 adoption approaches for SMEs is a Multi-Criteria Decision-Making (MCDM) problem. The Fuzzy Decision by Opinion Score Method (FDOSM) and the Fuzzy-Weighted Zero-Inconsistency (FWZIC) method are two of the most preferred MCDM methods. FDOSM was designed to function in a variety of fuzzy set contexts to cope with the ambiguity and uncertainty caused by the subjectivity of expert input. FWZIC determines criterion weight coefficients with zero inconsistency; to calculate each criterion's relevance in the decision-making process, it relies on variances in expert preference per criterion. This study used FWZIC and FDOSM to provide an overall evaluation of I4.0 technology adoption approaches and to assess the top five ranked adoption approaches against the Innovation Process Framework, which consists of Technological, Organizational and Environmental (TOE) aspects. 
Prior to this application, the research gap was identified and the benchmarking framework requirements were developed to decide on the best MCDM methods. The study consists of three main components: first, applying FWZIC to the I4.0 technologies as criteria; second, benchmarking the I4.0 adoption approaches using FDOSM to identify the top five adoption approaches for SMEs; and finally, applying FWZIC to the TOE framework and evaluating the top five ranked approaches against the established TOE weights. Decision matrices were developed along with the weighted matrices, and the final phase produced two different weighting matrices using FWZIC and an evaluation framework based on MCDM analysis using FDOSM. Evaluation and validation through benchmarking framework analysis and systematic ranking were applied to ensure a validated framework.
  • Item
    Exploring the Impact of Explainable Artificial Intelligence on Decision-making in Healthcare
    (The British University in Dubai, 2023-07) MOHAMMAD, AHMAD HASAN
    As artificial intelligence (AI) advances in healthcare, there is an increasing need to understand how AI-driven decision-making affects healthcare workers and patients. The development of explainable artificial intelligence (XAI) systems, which attempt to give visible and interpretable explanations for AI algorithms' judgements, is a vital part of AI in healthcare. This study investigates the influence of XAI on healthcare decision-making and its potential to improve trust, acceptance, and collaboration between AI systems and human decision-makers. The study analyses the benefits and limitations of applying XAI in healthcare decision-making processes through an exhaustive analysis of current literature and empirical data. It investigates how XAI might increase AI algorithm transparency, allowing healthcare practitioners to better comprehend the reasoning behind AI-generated suggestions or forecasts. Furthermore, it investigates how XAI might help to enhance trust among healthcare professionals, patients, and other stakeholders, leading to better informed and collaborative decision-making processes. The study also tackles possible barriers to XAI deployment in healthcare. The complexity of AI algorithms, the interpretability of XAI explanations, and the integration of XAI systems into conventional healthcare procedures are among the hurdles. Furthermore, ethical aspects such as privacy, security, and bias mitigation are studied to ensure that XAI is used responsibly in healthcare decision-making. The outcomes of this study lead to a better understanding of the influence of XAI on healthcare decision-making. This research seeks to give insights for policymakers, healthcare practitioners, and AI developers to support the responsible and successful integration of XAI into healthcare systems by shedding light on the benefits and issues connected with XAI. 
The ultimate objective is to use XAI to improve healthcare decision-making processes, improve patient outcomes, and allow the ethical and trustworthy deployment of AI in the healthcare sector.
  • Item
    Understanding the Intention to Use the Metaverse in IT Companies Using a Hybrid SEM-ANN Approach.
    (The British University in Dubai, 2023-07-01) ZAMMAR, AHMAD KHALED
    While embracing the metaverse within Information Technology (IT) companies could present unique opportunities, it also brings about challenges in adoption behavior. However, research on the factors influencing intentional behavior to use the metaverse in IT companies is scarce. To bridge this gap, this study develops a research model that integrates elements from the Unified Theory of Acceptance and Use of Technology 2 (UTAUT2), Task-Technology Fit (TTF), and awareness studies, and hypothesizes key variables such as performance expectancy, effort expectancy, and social influence. Through a comprehensive survey of 234 participants, the research model is evaluated employing a unique combination of Structural Equation Modeling (SEM) and Artificial Neural Network (ANN), which serve as advanced modeling techniques. The SEM and ANN analyses elucidate intricate relationships and make predictions about adoption behavior, while uncovering patterns and insights into metaverse adoption in IT companies. Although the primary focus is on SEM and ANN, this study also utilizes Partial Least Squares (PLS) in the research design. It identifies and discusses key findings from descriptive analysis, measurement model assessments, and structural model assessments. Furthermore, the ANN results and sensitivity analysis paint a more nuanced picture of metaverse adoption behavior in the IT sector, providing valuable predictions and insights. In addition to the theoretical contributions, the findings offer practical implications for IT companies and suggest future research directions to help them make informed decisions related to the implementation and use of the metaverse. Overall, this study contributes to the growing body of literature on the metaverse and its application in the business landscape, with specific emphasis on IT companies.
  • Item
    Examining the impact of Industry 5.0 on economic sustainability: A hybrid SEM-ANN approach
    (The British University in Dubai (BUiD), 2023-05) ALSHAMSI, SAIF RASHED
    While a significant body of literature exists on the industrial revolution, little is known about the impact of Industry 5.0 on economic sustainability examined with a hybrid SEM-ANN approach. This study aims to understand the impact of Industry 5.0 on economic sustainability. A quantitative analysis was carried out on data collected from over 363 participants. A two-stage analytical technique combining PLS-SEM with ANN was applied. First, PLS-SEM was used to understand the predictors and their influence on the sustained use of Industry 5.0. The results show a positive and non-significant relationship between the predictors and Industry 5.0 with respect to sustainability. The outcome of this study provides valuable insight for countries willing to adopt Industry 5.0 to understand its impact on economic sustainability.
  • Item
    Arabic Hotel Reviews Sentiment Analysis Using Deep Learning
    (The British University in Dubai (BUiD), 2023-06) ALMANSOORI, MOHAMMAD
    Arabic Hotel Feedback sentiment analysis plays a significant role in understanding the opinions and sentiments expressed by customers in their reviews. With the growing popularity of online platforms and social media, Arabic Hotel Feedback have become a valuable source of information for both hotel owners and potential customers. Sentiment analysis techniques aim to automatically classify the sentiment polarity of these reviews as positive, negative, or neutral, providing valuable insights into customer satisfaction and areas of improvement for hotels. In this study, we present a comprehensive analysis of Arabic Hotel Reviews sentiment analysis. We collected a large dataset of Arabic hotel Feedback from various online platforms, encompassing a wide range of hotels and customer experiences. The dataset was carefully annotated with sentiment labels by human annotators to serve as ground truth for training and evaluation purposes. We employed state-of-the-art machine learning and natural language processing techniques to develop sentiment analysis models specifically tailored for the Arabic language. Our models utilized advanced text preprocessing, feature extraction, and classification algorithms to accurately predict sentiment polarity in Arabic hotel reviews. We evaluated the performance of our models using various evaluation metrics, including accuracy, precision, recall, and F1-score, to assess their effectiveness in sentiment classification. The results of our study demonstrate the viability and effectiveness of sentiment analysis in Arabic Hotel Reviews. Our models achieved high accuracy and robust performance in sentiment classification, enabling hotel owners to gain valuable insights into customer sentiments and make informed decisions to enhance customer satisfaction and improve their services. 
The CNN model demonstrated superior performance in terms of precision, recall, F1-score, and accuracy, consistently achieving a score of 74% across all evaluation metrics. The SVM model closely followed with a score of 73% for the same metrics. The LSTM model exhibited slightly lower performance, achieving values between 70% and 71%. The DT model had the lowest scores among all the models, with values between 66% and 68%. The findings of this study contribute to the growing body of research in sentiment analysis and provide valuable insights into sentiment patterns specific to Arabic hotel reviews. Overall, this study highlights the importance of sentiment analysis in the context of Arabic hotel feedback and provides a foundation for future research and applications in the field. The insights gained from sentiment analysis can empower hotel owners, marketers, and decision-makers to better understand customer sentiments, address concerns, and optimize their services to meet customer expectations in the dynamic and competitive hotel industry.
  • Item
    Detecting Arabic Cyberbullying Tweets in Arabic Social Using Deep Learning
    (The British University in Dubai (BUiD), 2023-06) ALFALASI, FARIS Jr
    The widespread engagement with social media platforms in recent years has made cyberbullying a significant concern. It can have catastrophic effects on individuals, including despair, anxiety, and even suicide. Because manually detecting and categorizing vast volumes of electronic text is difficult, conventional methods for recognizing and combating cyberbullying have not proven successful. As a consequence, deep learning methods have emerged as a potential solution. Artificial neural networks and other deep learning approaches can automatically identify patterns and features in massive quantities of data, and can be applied to electronic text analysis to spot cyberbullying-related trends. Natural language processing techniques may be applied to text data to extract useful features such as sentiment, emotion, and subjectivity. To examine cyberbullying in social media using machine learning and deep learning techniques, a sizable dataset of electronic text was gathered from multiple social media platforms, including Twitter, Instagram, YouTube, and many other sites. The data must first be prepared so that deep learning algorithms can be trained on it before cyberbullying analysis can be done. Manually annotated data from a corpus collection was used to label the information for deep learning purposes. Pre-processing is a vital part of the data preparation process for cyberbullying detection. There are several varieties of Arabic, the three most common being Dialectal Arabic (DA), Modern Standard Arabic (MSA), and Classical Arabic (CA). Because of its widespread use on social media, DA is the focus of this work. The data was then preprocessed and classified according to the presence of cyberbullying. Two classification cases were adopted. The first was a 2-class classification in which the data was labeled as either cyberbullying or not cyberbullying. 
The second was a 6-class classification covering six different cyberbullying types. To categorize electronic text in these two cases, deep learning models such as convolutional neural networks (CNN), recurrent neural networks (RNN), and a combined CNN-RNN were trained on this data. The trained models were assessed on an independent test set and showed promise in identifying cyberbullying on social media. In the 2-class classification, LSTM achieved the best accuracy at 95.59%, while in the 6-class classification the best accuracy, 78.75%, was obtained with CNN. Meanwhile, LSTM achieved the highest F1-scores in both the 2-class and 6-class classifications, at 96.73% and 89%, respectively. These findings show how well deep learning techniques work in detecting cyberbullying and emphasize their potential for building automated systems that identify and combat cyberbullying on social media.
  • Item
    Arabic Sentiment Analysis for Gulf Opinion Leaders using a Deep Learning Approach Case Study: Covid-19-22
    (The British University in Dubai (BUiD), 2023-07) ALKETBI, SULTAN
    The COVID-19 pandemic has had a profound impact on global health and has affected various populations worldwide. In the Arab world, social media has emerged as a critical platform for expressing opinions, sharing information, and disseminating news related to COVID-19. However, the proliferation of false information and the spread of fear and panic on social media have created a significant problem. This study aims to investigate how Arab populations, including both opinion leaders and the general public, have responded to the COVID-19 pandemic on Twitter. The research focuses on analysing sentiment and developing a deep learning model to detect real news associated with the pandemic in Arabic text. By gathering and analyzing data from Gulf countries, the study provides insights into the sentiments expressed and contributes to understanding how opinion leaders and the general public engage with COVID-19 on Twitter. Additionally, the study evaluates the efficacy of the deep learning model in combating misinformation and highlights the significance of sentiment analysis and news detection in the Arabic language. Data collection was conducted using Twitter's API, focusing on Arabic tweets from Gulf opinion leaders, utilizing specific keywords, hashtags, and user accounts related to COVID-19. The testing phase involved collecting 100,000 tweets from January to June 2022, with an emphasis on quality and relevance, including opinion leaders with significant follower counts and those recognized for their expertise or influence in the field. Overall, this research contributes to understanding the response to COVID-19 on Twitter and provides valuable insights into sentiment analysis and the detection of real news in Arabic text.
  • Item
    Understanding the impact of using Chatbot for shipment delivery toward environmental sustainability using the SEM-ANN approach
    (The British University in Dubai (BUiD), 2023-03) ALSARAYREH, SALLAM SALEM
    This research paper examines chatbot technology acceptance in customer service and shipment delivery, adoption studies, and sustainability. However, little research concerns the effect of the use of chatbots for shipment delivery and the correlation with environmental sustainability. The study aims to evaluate the impact of chatbot use on environmental sustainability in shipment delivery and develop an integrated theoretical framework for chatbot acceptance. An online survey was conducted with 344 participants from UAE residents to test the proposed model, which considers individual, and task-technology fit factors and extends UTAUT2. The study takes a unique approach compared to prior literature by relying on structural equation modeling (SEM) and artificial neural network (ANN) to analyze hypotheses. The study findings revealed that task-technology fit had the most significant effect on sustainable chatbot use for shipment delivery, with an (87%) normalized importance score, followed by social influence (81%), hedonic motivation (78%), habit (72%), individual-technology fit (57%), and facilitating conditions (47%). Sensitivity analysis results further revealed that these factors play a crucial role in shaping consumer attitudes toward chatbot use in the shipment delivery domain and their impact on environmental sustainability. The study provides valuable recommendations for developers, designers, and decision-makers in shipment delivery based on these results. Focusing on these areas can improve the results related to effort and performance expectancy, ensuring that chatbots are used for shipment delivery as effectively and efficiently as possible. However, additional research is required to validate the results and extend the proposed theoretical framework to other domains.
  • Item
    Variational Auto Encoder Approach To Find Deferentially Expressed Genes
    (The British University in Dubai (BUiD), 2022-05) RAHIMAN, NABIL
    A study of differentially expressed genes across different cell types helps identify cell-specific responses to treatments or diseases. Recent advances in single-cell technology enable the analysis of thousands of cells, which has brought many computational challenges in terms of noise in the data sets and the computational power required to handle big data. In recent years, deep learning models have been used as biological models for single-cell analysis. State-of-the-art deep learning techniques successfully extract non-linear feature sets from single-cell data that are used for various downstream analyses. Recently, deep learning models such as the Autoencoder (AE) and Variational Autoencoder (VAE) have been used to capture hidden patterns in single-cell gene expression data. In this paper, I propose a framework based on a variational autoencoder, called BiDiffVAE (Bi-directional Differential Variational Autoencoder), to extract differentially expressed genes. The proposed method makes use of the cluster distribution on every latent space and merged weights in the decoder to assign genes to a cluster. The results reveal new sets of genes that were not identified by state-of-the-art techniques, and the method can properly rank the top genes based on their significance for clustering.
  • Item
    Sentiment Analysis for Arabic Social media Movie Reviews Using Deep Learning
    (The British University in Dubai (BUiD), 2022-10) MEZAHEM, FATEMA HAMAD
    This work applies sentiment analysis (SA) to Arabic movie reviews on social media. Automatically detecting attitude or sentiment in a text is often helpful. By classifying data as positive, negative, or neutral, SA aids our understanding of the precise emotions that underlie the broader feelings typically associated with behavior. By utilizing multiple word representations and deep learning approaches, this work seeks to enhance classification performance. Through the use of mobile apps, the internet, and social media portals, there has been a tremendous increase in data in recent years. People are now able to share their opinions about specific topics thanks to the rapid development of technologies and social media platforms. People all around the world frequently use a number of these social media sites to share their evaluations and opinions of movies. Technologies like machine learning (ML) and deep learning (DL) have made it simpler for individuals to identify movies that live up to their expectations by evaluating prior reviews. Massive amounts of data can be collected every day from social media networks such as YouTube, Twitter, Instagram, and many other platforms. The tools used for collecting data are Vicinitas for Twitter and IGCommentExport for Instagram. The testing datasets were collected mainly from Instagram reviews of two Arabic movies: Wahed Tani, which translates to "someone else", and Amahom, which translates to "their uncle". Three datasets were employed, and several classification models were compared across them. Prior to performing sentiment analysis, the data must be prepared so that it can be used to train ML algorithms. The data gathered from a corpus collection was manually annotated to label it for ML use. For sentiment analysis, pre-processing is a crucial step in the data preparation process. 
Data pre-processing is a crucial step in NLP tasks to enhance performance and ensure the accuracy of the sentiment analysis. We translated some of the most common emojis into their Arabic meanings. There are different types of Arabic; the three main ones are Classical Arabic (CA), Modern Standard Arabic (MSA), and Dialectal Arabic (DA). In this paper we focus on DA since it is commonly used on social media. The main dataset was the Arabic Sentiment Analysis Dataset (ASAD), a novel large Twitter-based benchmark (Alharbi et al., 2020). The proposed CNN, RNN, CNN-RNN, and BERT models were used with the three datasets. Two of these datasets were used with the BERT model for comparison against the other examined models. We tested the CNN model first, then the LSTM, and finally the CNN-LSTM combination. After comparing these three models, the best one was chosen for comparison with the BERT model. The hybrid CNN-LSTM model achieved an accuracy of 90%. Finally, we compared CNN-LSTM with the BERT model; the BERT model outperformed all other classifiers in terms of accuracy (91%), recall (71%), precision (83%), and F-measure (77%).
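As a quick consistency check on the BERT figures reported above, the F-measure can be recomputed from the stated precision and recall. This is a minimal sketch, assuming the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall) computed from the same precision and recall it reports; the abstract does not state the averaging scheme, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the BERT model.
bert_f = f_measure(precision=0.83, recall=0.71)
print(f"{bert_f:.2%}")
```

Under that assumption the result is roughly 76.5%, which rounds to the reported 77%.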
  • Item
    Sentiment Analysis for opinion leaders on Twitter: A Case Study of COVID-19
    (The British University in Dubai (BUiD), 2022-11) MIR, REEM SAJID
    The coronavirus disease (COVID-19) is an ongoing global problem; a pandemic was declared early in 2020 during the outbreak. Social media platforms were used during the pandemic to share views and exchange information. This study aims to provide a framework for sentiment analysis of opinion leaders on Twitter. The experiments targeted COVID-19-specific tweets from four opinion leaders and applied machine learning models to them. The dataset was collected using COVID hashtags and consists of tweets posted in English. Sentiment analysis is then performed on these tweets. The tweets are preprocessed to prepare them for evaluation. The research reports findings from sentiment analysis of these tweets with machine learning models: the logistic regression model provided the best accuracy, followed by the multi-layer perceptron, support vector machine, convolutional neural network, and decision tree. As tweets directly affect people’s thoughts, the purpose of these results was to understand the sentiment of tweets from diverse public opinion leaders around the world during COVID-19.
  • Item
    Developing a Framework for Weapon and Mask Detection in Surveillance Systems
    (The British University in Dubai (BUiD), 2022-11) ZAHRAWI, MOHAMMAD HUSNI
    Financial institutions, jewelry stores, hypermarkets and automated teller machines all experience yearly thefts of vast amounts of money. Police have foiled a few of the robbery attempts and successfully apprehend most of the robbers. Maintaining safety and security is a difficult task for governments around the globe, particularly in a country like the UAE, which is home to more than 200 nationalities. This study examines the application of neural network models in video surveillance systems for detecting weapons and thus preventing robberies. By expanding the dataset to include more classes and more photos per class, the proposed model could perform well enough to be installed on outdoor surveillance systems. In this study, we examine weapon detection scenarios, develop models using transfer learning approaches, and compare them with other contemporary detectors such as YOLOv5. We develop our own dataset and compare it with another dataset in terms of classes, image quality, and the kinds of items used for committing a robbery. Gun detectors in surveillance systems have a wide range of additional uses, from residential units to the military.
  • Item
    Arabic Sign Language Recognition: A Deep Learning Approach
    (The British University in Dubai (BUiD), 2022-05) ALMAHRI, HAMDA GHALIB AWADH ALI
    With more than 300 sign languages across the world, sign language interpreters are not always available to translate spoken words into sign language and vice versa. As people with hearing and speech impairments rely on sign language for communication, this limits their communication with others. One solution is to use Sign Language Recognition systems, which allow communication between users of sign language and those who do not use it, without the need for interpreters. Considering the success of Deep Learning in Computer Vision tasks, we observe the advantage it can provide for Arabic Sign Language Recognition. This research has two aims. First, we review the current status of research in Arabic Sign Language Recognition using Deep Learning and identify research gaps. Second, we aim to build a Sign Language Recognition system that bridges the gap. We achieve the first aim through a systematic review that identifies primary studies using deep learning models for Arabic Sign Language Recognition. Out of 414 identified studies, 67 were deemed relevant to our topic. Out of those, 32 studies passed our full selection procedure. We were able to discover patterns in the research and find that the biggest issue is data collection, as current datasets do not offer enough variety and are not representative of real-life scenarios. Current methods are either too expensive or easily affected by the surrounding environment. Thus, for the second part, we offer a solution for data collection using MediaPipe, which allows us to collect data directly through the webcam. We leverage this framework to build a recognition system for Emirati Sign Language that recognizes the signs for the seven Emirates. We used an LSTM model and achieved an accuracy of 100% on the testing dataset.
  • Item
    Using Educational Data Mining Techniques in Predicting Grade-4 students’ performance in TIMSS International Assessments in the UAE
    (The British University in Dubai (BUiD), 2018-04) SHWEDEH, FATEN
    Educational Data Mining (EDM) is the process of discovering information and relationships in educational data for a better understanding of students’ performance and the characteristics of their education providers. Classification is a Data Mining (DM) technique used for prediction, while feature selection is the process of finding the set of features that has the most impact on a specific target. This dissertation provides an extensive descriptive and predictive analysis of Grade-4 student performance in the Trends in International Mathematics and Science Study (TIMSS) in the United Arab Emirates (UAE). The main purpose is to bridge the gap between EDM and international assessments in the Arab world by applying EDM to predict Grade-4 student levels in TIMSS assessments in the UAE. We examined different feature selection methods and classification algorithms to find the prediction model with the highest accuracy. The study was then extended to delve deeper into data from Dubai’s private schools, to discover the important features leading to improvement and to build a prediction model of whether a school will improve in future TIMSS assessment cycles. The Tree-based feature selection method combined with a Decision Tree (DT) classifier built the most accurate prediction models on most TIMSS datasets. The key factors influencing students’ performance in science are identified and presented. To the best of our knowledge, this study is the first scientific analysis implementing EDM in the field of international assessments in the UAE, and the first scientific study that considers the full TIMSS questionnaire database in an EDM task.
  • Item
    Arabic Image Captioning (AIC): Utilizing Deep Learning and Main Factors Comparison and Prioritization.
    (The British University in Dubai (BUiD), 2022-02) HEJAZI, HANI DAOUD
    Captioning of images has been a major concern for the last decade, with most of the efforts aimed at English captioning. Little work has been done for Arabic, and relying on translation as an alternative to creating Arabic captions leads to errors accumulating across translation and caption prediction. When working with Arabic datasets, preprocessing is crucial, and handling Arabic morphological features such as Nunation requires additional steps. We tested 32 different combinations of variables that affect caption generation, including preprocessing, deep learning techniques (LSTM and GRU), dropout, and feature extraction (Inception V3, VGG16). Our results on the only publicly available Arabic dataset outperform the best result, with BLEU-1=36.5, BLEU-2=21.4, BLEU-3=12 and BLEU-4=6.6. This study demonstrated that using Arabic preprocessing and VGG16 image feature extraction enhanced Arabic caption quality, but we saw no measurable difference when using Dropout or LSTM instead of GRU.
  • Item
    Use of Data Mining Techniques to Detect Fraud in Procurement Sector
    (The British University in Dubai (BUiD), 2022-01) AL HAMMADI, SUMAYYA ABDULLA
    Procurement is an extensive and complex sector in the manufacturing industry, and it has attracted extensive and widespread fraud that directly impacts the operation of organizations and the economy at large. These fraudulent activities have contributed to rising problems in the manufacturing industry. Several fraud detection systems are used in the procurement and logistics sector, but they are unable to capture the burden of the money lost and the abnormal behaviors in the procurement process. Another major problem with the current systems is that an ever-growing amount of data requires a proportionally growing number of staff members to analyze it. In addition, some organizations carry out this task manually using their specialized staff. Despite the implementation of various strategies aiming to fight and reduce fraud in the procurement sector, such as random and periodic audits, whistle-blowing, and many others, most UAE organizations still use a manual approach to auditing and monitoring the procurement process. This remains a challenge for most businesses in the UAE. This research aims to analyze the reliability and efficiency of data mining techniques in detecting and preventing fraud in the procurement sector in the UAE and globally. The method used in this research is a classification of the models and algorithms used in data mining. The techniques studied include clustering, pattern tracking, classification, and outlier detection. From this study, I found that most organizations lose substantial sums through fraud in their procurement sector. However, unsupervised data mining techniques are reliable in detecting fraud before it happens. The research also established the importance of data mining in detecting fraud in procurement.
Data analytics involves structuring data so that it is usable and accessible to the teams or individuals who require information about procurement in a company. This makes it easier to detect fraud and thus prevent it from happening. The findings from this study will help implement a system that significantly reduces fraud in the procurement sector and saves companies a lot of money. This study concluded that most companies lose money due to fraud and are willing to invest in fraud detection and control systems to curb it. Fraud detection is a field that requires dynamic research and periodic upgrades and innovations because fraudsters are many and skilled; they consistently devise new, less detectable ways to commit fraud. From this study, I found that data mining techniques will help discover otherwise invisible patterns and raise alerts about fraudsters. Companies need to acquire new technologies and methods to mitigate fraud in their procurement sector.
  • Item
    Adaptive Secure Pipeline for Attacks Detection in Networks with set of Distribution Hosts
    (The British University in Dubai (BUiD), 2022-01) ALSHAMSI, SUROUR
    Malware continues to represent one of the main computer security threats. It is difficult to build efficient detection systems that precisely separate normal behavior from malicious behavior based on the analysis of network traffic. This is due to the characteristics of both kinds of traffic: normal traffic is very complex, diverse and changing, while malware is also changeable, migrates, and hides itself by pretending to be normal traffic. In addition, there is a large amount of data to analyze, and detection must happen in real time to be useful. It is therefore necessary to have an effective mechanism to detect malware and attacks on the network. Ensembling algorithms arise from the desire to benefit from multiple different classifiers and exploit their strengths: they combine the results of the individual classifiers into a final result to achieve greater precision. This can also be applied to cybersecurity problems, in particular to the detection of malware and attacks through the analysis of network traffic, which is the challenge addressed in this thesis. Existing research on ensemble learning for attack detection mainly aims to increase the performance of machine learning algorithms by combining their results. Most studies propose an ensemble learning technique, either existing or created by the authors, to detect one particular type of attack rather than attacks in general. So far, none addresses the use of Threat Intelligence (TI) data in Ensemble Learning algorithms to improve the detection process, nor does any work as a function of time, that is, taking into account what happens on the network in a limited time interval. The objective of this thesis is to propose a methodology for applying ensembling to the detection of infected hosts while considering these two aspects.
In line with the proposed objective, ensembling algorithms applicable to network security have been investigated and evaluated, and a methodology for detecting infected hosts using ensembling has been developed, based on experiments designed and tested with real datasets. This methodology carries out the detection of infected hosts in three phases, which are repeated after each fixed time interval. Each phase applies ensembling with a different objective. The first phase classifies each network flow belonging to the time window as malware or normal. The second phase classifies the traffic between an origin and a destination as malicious or normal, indicating whether it is part of an infection. Finally, the third phase classifies each host as infected or not infected, considering the hosts that originate the communications. Implementing the process in phases allows each phase to solve one aspect of the problem and to take the predictions of the previous phase, which are combined with the phase's own analysis to achieve better results. It also implies carrying out the training and testing process in each phase. Since the best model is obtained from training, each time training is performed for a given phase, the model is adjusted to detect new attacks. This represents an advantage over tools based on fixed or static rules, where the behavior must already be known before new rules can be added.
  • Item
    Exploring Sentiment Analysis using Different Machine Learning Algorithms on Dialectal Arabic
    (The British University in Dubai (BUiD), 2021-04) AL MANSOORI, MOUZA
    Today, the intense use of and reliance on modern technologies such as mobile phones, e-commerce, interactive websites, social media, wearable technologies, sensors, and satellites is enabling data to be generated every second, resulting in the availability of huge amounts of structured and unstructured data. The field of big data analytics therefore emerged to tame this data and use it to provide useful insights to the world. Sentiment Analysis is one application of big data analysis when dealing with text data. It refers to the process of extracting and analysing emotions from a given text to classify its polarity, mainly within three classes: positive, negative and neutral. Much research has been done on Sentiment Analysis for English text data, while more exploration is still required on Arabic Twitter sentiment analysis. This paper focuses on dialectal Arabic sentiment analysis. The study explores sentiment analysis using different machine learning algorithms on a dialectal Arabic text dataset. We used Twitter as our data source, so our dataset consists of Arabic tweets. The purpose of this study is to examine the performance of sentiment analysis on three datasets that have different levels of dialectal Arabic: a mixed-dialect dataset, a Gulf dialect dataset, and an Emirati dialect dataset. Two machine learning classifiers were used in this experiment: the support vector machine (SVM) and Naïve Bayes (NB). The results of this experiment indicate that when sentiment analysis is applied to one specific dialect group, the accuracy is higher than on the mixed-dialect dataset under the same settings. The experiment also supports other studies in that the SVM classifier outperformed the NB classifier. We conclude that additional research is required to explore Arabic sentiment analysis further across different dialects.
  • Item
    Comparative Study of Deep Learning Models for Unimodal & Multimodal Disaster Data for Effective Disaster Management
    (The British University in Dubai (BUiD), 2021-07) MOHAMED, DENA AHMED
    Multimodal data of text and images in social media posts hold valuable information that can be utilized during crisis events. Such information includes requests for help, rescue efforts, warnings, infrastructure damage, missing people, injured or dead individuals, volunteers, donations, and many more. Many studies focus only on the text modality, single classification tasks and small-scale home-grown datasets when studying how useful social media data can be for emergency services. In this study, a multimodal deep learning system for automatic classification of disaster tweets was developed. Two classification tasks were tackled: informativeness and humanitarian category. An extensive comparison between unimodal text-only, unimodal image-only and multimodal deep learning models was conducted across three different representative disaster datasets (CrisisMMD, CrisisNLP, and CrisisLex26). Convolutional neural networks are used to define the deep learning architectures. Experiments across the multiple settings and datasets show that multimodal models perform better than their unimodal counterparts. It was also found that mapping between the diverse humanitarian categories and consolidating smaller datasets with larger ones significantly improve the models’ performance when compared to individual datasets. The consolidated dataset can serve as a new baseline multimodal dataset for further research directions.