Predict Student Success and Performance factors by analyzing educational data using data mining techniques

The British University in Dubai (BUiD)
Academic institutions around the globe strive to build strong reputations and continually work to improve their students' ability to gain and apply knowledge in the field. An institution's primary outcome is the quality of education its students receive, and institutions are known by how their graduates perform in practice. Educational institutions therefore seek insights that help students succeed, acquire knowledge, and improve their abilities. Such insights enable institutions to retain students, graduate them on time, make them workplace-ready, and improve the institution's reputation. The primary aim of this study is to identify the key attributes that contribute to student performance. Past research has mainly focused on academic assessment grades, GPA, and student demographics. This study considers additional attributes: the number of students in a class, student attendance in class and, because the United Arab Emirates is a diverse, multicultural country, English language proficiency and the nationality and age of the students and the instructor. The study is conducted as an experimental analysis that develops models from nine machine learning algorithms: KNN, Naïve Bayes, SVM, Logistic Regression, Decision Tree, Random Forest, AdaBoost, Bagging Classifier, and Voting Classifier. The models are applied to data collected from a reputable university, comprising 126,698 records with twenty-six (26) initial data attributes. The results show that the Random Forest model outperformed the other models, with an accuracy of 90.12%. Class attendance showed a positive correlation with grades, while the number of students in a class showed a negative correlation.
Future work will extend the study to include more attributes from various aspects and to provide recommendations for students, instructors, and the educational institution.
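The correlation finding reported in the abstract (attendance positively related to grades, class size negatively related) can be illustrated with a minimal sketch. It assumes the Pearson coefficient is the measure of interest, which the abstract does not state. The function name `pearson`, the variable names, and the five toy records are hypothetical and are not the study's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up toy records for illustration only (NOT the study's data).
attendance_pct = [60, 70, 80, 90, 95]   # hypothetical attendance per student
class_size = [45, 40, 35, 25, 20]       # hypothetical class size per student
grade = [55, 62, 70, 84, 88]            # hypothetical final grade per student

r_attendance = pearson(attendance_pct, grade)
r_class_size = pearson(class_size, grade)

# The sign of each coefficient gives the direction of the relationship.
print(f"attendance vs. grade: {r_attendance:+.3f}")
print(f"class size vs. grade: {r_class_size:+.3f}")
```

In this toy data the first coefficient is positive and the second is negative, matching the direction of the relationships the abstract reports. The magnitudes say nothing about the real dataset.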
Education Data Mining (EDM), machine learning, student performance prediction, Naïve Bayes, random forest, support vector machines, logistic regression, decision tree, academic institutions