Dimensionality reduction for predicting student performance in unbalanced data sets


Lim, Theng Wai and Khor, Kok Chin and Ng, Keng Hoong (2019) Dimensionality reduction for predicting student performance in unbalanced data sets. International Journal of Advances in Soft Computing and its Application, 11 (2). pp. 76-86. ISSN 2074-8523

74.pdf - Published Version
Restricted to Repository staff only



In this study, we evaluated two data sets from two Portuguese schools for predicting student performance. These data sets contain not only the students' previous grades but also demographic, social, and school-related features. Both data sets have unbalanced class distributions and contain some irrelevant features. Such characteristics may cause unsatisfactory True Positive (TP) rates for the Fail grade, which is important to predict but is under-represented compared with the Pass grade. To improve prediction, dimensionality reduction was performed on both data sets to generate subsets that contained (i) features selected by a wrapper approach and (ii) only previous grade(s). The results showed that dimensionality reduction helped to improve the TP rates for the Fail grade while still attaining good classification accuracies. We also noticed that comparable accuracies can be achieved even when a subset contains only one previous grade.
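
The distinction the abstract draws between overall accuracy and the TP rate for the Fail grade can be illustrated with a small sketch. Everything below is hypothetical and is not taken from the paper: the ten toy students, the grade values, the pass mark of 10, and the helper names `accuracy` and `tp_rate` are assumptions for illustration only. The threshold rule merely stands in for a classifier trained on a single previous grade; it is not the authors' method, and the wrapper approach is not shown.

```python
def accuracy(y_true, y_pred):
    """Fraction of all instances that were classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def tp_rate(y_true, y_pred, label):
    """Fraction of instances whose true class is `label` that were predicted as `label`."""
    preds_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds_for_label) / len(preds_for_label)


# Hypothetical, unbalanced toy data: 8 Pass students and 2 Fail students.
y_true = ["Pass"] * 8 + ["Fail"] * 2

# Assumed previous-grade values for the same 10 students (illustrative only).
previous_grade = [15, 12, 11, 13, 10, 14, 9, 16, 8, 7]

# Baseline that ignores every feature and always predicts the majority class.
always_pass = ["Pass"] * len(y_true)

# Stand-in for a model that uses only one previous grade:
# predict Fail when the previous grade is below an assumed pass mark of 10.
PASS_MARK = 10  # assumption, not stated in the source
grade_only = ["Fail" if g < PASS_MARK else "Pass" for g in previous_grade]

print(accuracy(y_true, always_pass), tp_rate(y_true, always_pass, "Fail"))  # 0.8 0.0
print(accuracy(y_true, grade_only), tp_rate(y_true, grade_only, "Fail"))    # 0.9 1.0
```

On this toy data the majority-class baseline reaches 80% accuracy while detecting no failing student, which is why the TP rate for the minority class is reported separately from accuracy on unbalanced data sets.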

Item Type: Article
Uncontrolled Keywords: Student Performance Prediction, Data Mining, Dimensionality Reduction
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 20 Aug 2021 05:54
Last Modified: 20 Aug 2021 05:54
URI: http://shdl.mmu.edu.my/id/eprint/8802

