Enhancing Imbalanced Data Augmentation: A Comparative Study of GANified-SMOTE and Latent Factor Integration

Citation

Ruslan, Rusma Anieza and Arbaiy, Nureize and Lin, Pei Chun (2025) Enhancing Imbalanced Data Augmentation: A Comparative Study of GANified-SMOTE and Latent Factor Integration. Journal of Informatics and Web Engineering, 4 (3). pp. 483-493. ISSN 2821-370X

Abstract

Imbalanced datasets are a serious problem in machine learning (ML). Minority class samples are usually sparse but carry significant meaning. An unbalanced class distribution can bias the model toward the majority class, which yields misleadingly high accuracy while the model fails to detect minority cases. This bias is most dangerous in critical applications, where ignoring minority cases can be highly destructive. The Synthetic Minority Oversampling Technique (SMOTE) is one of the most widely used methods for addressing this problem. SMOTE balances the class distribution by interpolating between existing minority samples. However, it can create samples that are too close to one another, which can lead to overfitting and limit the generalization of the model. Recent advancements in generative modelling, especially Generative Adversarial Networks (GANs), offer a more effective solution for handling class imbalance. GANs use a generator-discriminator structure to produce synthetic data that is highly similar to real data. A hybrid technique called GANified-SMOTE combines SMOTE with the generative power of GANs to produce more diverse and realistic minority class samples. The technique improves model robustness and overcomes the limitations of traditional oversampling. This paper presents the incorporation of latent factors into the architecture of the GANified-SMOTE framework. Latent variables reveal hidden structures and relations in the data, leading to synthetic samples that are closer to the real data and improving classification accuracy. By incorporating latent factors, this research aims to build a better oversampling method for imbalanced classification datasets.
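
The interpolation step that the abstract attributes to SMOTE can be illustrated with a minimal sketch. This is not the paper's GANified-SMOTE method or its latent-factor extension; it is a simplified, standard-library-only illustration of plain SMOTE-style interpolation. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and chosen only for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (illustrative only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random interpolation weight in [0, 1): 0 keeps the base point,
        # values near 1 approach the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two features (assumed data, not from the paper).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Every synthetic point in this sketch lies on a line segment between two existing minority samples, which is one way to see why the abstract describes plain SMOTE output as clustered close to the original data.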

Item Type: Article
Uncontrolled Keywords: Imbalanced dataset, SMOTE, generative adversarial networks
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics
Depositing User: Nurin Syazwani Azmi
Date Deposited: 11 Nov 2025 01:22
Last Modified: 11 Nov 2025 01:22
URI: http://shdl.mmu.edu.my/id/eprint/14850