Imbalanced Data Classification Using Oversampling and Automatic Feature Selection Methods for Undergraduate Student Career Prediction

Citation

Haque, Radiah and Goh, Hui Ngo and Ting, Choo Yee and Quek, Albert and Hasan, Md Rakibul (2024) Imbalanced Data Classification Using Oversampling and Automatic Feature Selection Methods for Undergraduate Student Career Prediction. In: 2024 13th International Conference on Educational and Information Technology (ICEIT), 22-24 March 2024, Chengdu, China.

Abstract

Applying machine learning techniques to predict the career trajectories of fresh undergraduate students has become a crucial strategy for evaluating their potential to secure employment after graduation or to pursue further education. In such applications, however, imbalanced data is a critical issue that must be addressed with proper methods. In this paper, oversampling, using the Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), is combined with feature selection, using Recursive Feature Elimination (RFE) and the Boruta algorithm. The results show that the SMOTE-based Boruta approach is effective in improving the performance of machine learning classification models for undergraduate student career prediction.
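
To illustrate the oversampling step named in the abstract, the following is a minimal standard-library sketch of SMOTE-style interpolation for a multiclass dataset. It is not the authors' implementation: the function names `smote` and `balance`, the toy feature values, and the class labels (`employed`, `further_study`, `unemployed`) are all hypothetical, and the feature selection stage (RFE or Boruta) is not shown.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from one class by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats), all from the same class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other same-class samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between the base point and its neighbor.
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, k=3, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out


# Hypothetical toy data: two numeric features and three imbalanced classes.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.1, 0.1],
     [1.0, 1.0], [1.2, 0.9], [1.1, 1.3],
     [2.0, 0.0], [2.1, 0.2]]
y = ["employed"] * 6 + ["further_study"] * 3 + ["unemployed"] * 2

X_bal, y_bal = balance(X, y)
print({label: y_bal.count(label) for label in sorted(set(y_bal))})
```

In a workflow like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.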

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Machine learning, multiclass classification
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 03 Jul 2024 02:08
Last Modified: 03 Jul 2024 02:08
URI: http://shdl.mmu.edu.my/id/eprint/12565
