Data mining approach to predict student's selection of program majors

dc.Location: 2019 T 58.6 S53
dc.Supervisor: Professor Khaled Shaalan
dc.contributor.author: SIDDARTHA, SHARMILA
dc.date.accessioned: 2019-10-30T07:42:51Z
dc.date.available: 2019-10-30T07:42:51Z
dc.date.issued: 2019-06
dc.description.abstract: Students in higher education do not have access to sufficient information when selecting their program major. Program administrators cannot easily predict which majors will be undersubscribed early enough to take corrective action. At the same time, institutional databases hold large volumes of data on student demographic profiles, course grades and academic performance. There is an opportunity to apply data mining to arrive at a model that predicts student selection of a major. Academic data relating to student majors is multiclass and imbalanced: there is always a niche major with few students enrolled. Such data therefore needs special consideration in data mining. The purpose of this study is to develop a data mining approach for predicting students' selection of program majors. The approach includes a methodology to manage data mining projects, sampling techniques to handle imbalanced and multiclass data, a set of classification algorithms for prediction, and measures to evaluate model performance. The study uses a systematic literature review to source, evaluate and synthesize current information in this domain, and CRISP-DM to deploy data mining activities. Several data mining techniques, such as data exploration, visualization, sampling and evaluation, are presented and applied to the academic data. Data mining experiments are deployed in RapidMiner using Decision Trees, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Networks and Gradient Boosted Trees. Balanced sampling and SMOTE (oversampling of minority classes) are applied, and results are compared using the confusion matrix, F1-score and balanced accuracy. Cross validation is applied to train and test the models. Naïve Bayes and Decision Trees offered the best predictions across the different sampling techniques.
This study presents an approach to design and deploy a data mining project that can be used as a basis for developing systems to support the selection of student majors. [en_US]
dc.identifier.other: 20160199
dc.identifier.uri: https://bspace.buid.ac.ae/handle/1234/1509
dc.language.iso: en [en_US]
dc.publisher: The British University in Dubai (BUiD) [en_US]
dc.subject: higher education [en_US]
dc.subject: artificial neural networks [en_US]
dc.subject: program administrators [en_US]
dc.subject: institutional databases [en_US]
dc.subject: academic performance [en_US]
dc.subject: data mining [en_US]
dc.title: Data mining approach to predict student's selection of program majors [en_US]
dc.type: Dissertation [en_US]
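
The abstract names the confusion matrix, F1-score and balanced accuracy as the evaluation measures for imbalanced, multiclass data. The following is a minimal Python sketch of why those measures are preferred over plain accuracy in that setting. It is not the dissertation's implementation (the study ran its experiments in RapidMiner); the class labels "A", "B", "C", the counts, and the function names are all hypothetical and chosen only for illustration.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_scores(matrix):
    """Return per-class recall and F1 lists computed from a confusion matrix."""
    n = len(matrix)
    recalls, f1s = [], []
    for k in range(n):
        tp = matrix[k][k]
        actual = sum(matrix[k])                          # row sum: true members of class k
        predicted = sum(matrix[r][k] for r in range(n))  # column sum: predictions of class k
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return recalls, f1s


def evaluate(y_true, y_pred, labels):
    """Compute accuracy, balanced accuracy (mean recall) and macro-averaged F1."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    recalls, f1s = per_class_scores(matrix)
    correct = sum(matrix[k][k] for k in range(len(labels)))
    return {
        "accuracy": correct / len(y_true),
        "balanced_accuracy": sum(recalls) / len(recalls),
        "macro_f1": sum(f1s) / len(f1s),
    }


# Hypothetical data: two large majors (A, B) and one niche major (C).
labels = ["A", "B", "C"]
y_true = ["A"] * 9 + ["B"] * 9 + ["C"] * 2
# A model that never predicts the niche major: both C students are labelled A.
y_pred = ["A"] * 9 + ["B"] * 9 + ["A"] * 2

scores = evaluate(y_true, y_pred, labels)
print(scores)
```

In this made-up case the model never predicts the niche class, yet plain accuracy stays high because the two large classes dominate. Balanced accuracy and macro F1 weight every class equally, so the missed niche class pulls both scores down noticeably.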
Files
Original bundle
Name: 20160199.pdf
Size: 4.23 MB
Format: Adobe Portable Document Format