Citation
Santoso, Heru Agus and Haw, Su Cheng (2023) Improvement of k-Means Clustering Performance on Disease Clustering using Gaussian Mixture Model. Journal of System and Management Sciences, 13 (5). ISSN 1816-6075, 1818-0523
Abstract
The k-Means clustering algorithm is an unsupervised learning method that uses hard assignment: it provides no opportunity for a data point to be a member of two or more clusters. In practice, a data point can belong to two or more clusters. In our dataset, a particular set of diseases can be a member of different cluster locations. The Gaussian Mixture Model (GMM) can solve this problem of k-Means' hard assignment. The dataset was also preprocessed using PCA after the Hopkins statistic proved far from sufficient for clustering purposes. PCA reduces the dimension of the dataset and provides the most informative variables, which explain the majority of the data. The Hopkins statistic reached 0.958 after performing PCA, indicating that the dataset has a high tendency to cluster. When k-Means clustering and GMM were compared using log-likelihood, GMM yielded a better result, i.e., 2.217, compared to k-Means, which yielded -606.604. This means GMM outperforms k-Means in terms of model fitness to the dataset.
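The difference between hard and soft assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the one-dimensional setup, the parameter values, and the helper names `hard_assign` and `soft_assign` are assumptions chosen only for illustration. Hard assignment returns a single cluster index, while a Gaussian mixture returns a membership probability for every component.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def hard_assign(x, centers):
    """k-Means style: return the index of the nearest center only."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

def soft_assign(x, weights, means, stds):
    """GMM style: posterior probability (responsibility) of each component."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-cluster setup (illustrative values, not from the paper).
means = [0.0, 4.0]
stds = [1.0, 1.0]
weights = [0.5, 0.5]

x = 1.9  # a point lying between the two clusters
print(hard_assign(x, means))                 # one cluster index
print(soft_assign(x, weights, means, stds))  # membership share per cluster
```

For a point near the midpoint of the two centers, the hard assignment still commits to one cluster, whereas the responsibilities stay close to an even split, which is the behavior the abstract refers to as soft assignment.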
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Clustering, soft-assignment clustering, k-Means, Gaussian Mixture Model. |
Subjects: | Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science |
Divisions: | Faculty of Computing and Informatics (FCI) |
Depositing User: | Ms Nurul Iqtiani Ahmad |
Date Deposited: | 31 Oct 2023 09:14 |
Last Modified: | 31 Oct 2023 09:14 |
URI: | http://shdl.mmu.edu.my/id/eprint/11806 |