Optimising Phishing Detection: A Comparative Analysis of Machine Learning Methods with Feature Selection

Citation

Daniel, Mohamad Asraf and Chong, Siew Chin and Chong, Lee Ying and Wee, Kuok Kwee (2025) Optimising Phishing Detection: A Comparative Analysis of Machine Learning Methods with Feature Selection. Journal of Informatics and Web Engineering, 4 (1). pp. 200-212. ISSN 2821-370X

View of Optimising Phishing Detection_ A Comparative Analysis of Machine Learning Methods with Feature Selection.pdf - Published Version
Restricted to Repository staff only


Abstract

Phishing is a type of cyberattack that tricks people into sharing sensitive data. Due to the inefficiency of current security technologies, researchers have recently paid much attention to employing machine learning methods for phishing detection. In our proposed solution, the effectiveness of machine learning techniques combined with feature selection techniques for phishing detection is investigated. Specifically, Random Forest (RF) and Artificial Neural Network (ANN) are integrated with two feature selection techniques, Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE). The goal was to identify the model with the highest classification accuracy. The experiments were evaluated using a dataset of 4,898 phishing sites and 6,157 legitimate sites, with the phishing data sourced from Kaggle.com. Our experiments demonstrate that the combination of the RF model with PCA achieved 95.83% accuracy, while the ANN model with PCA reached 95.07% accuracy. The incorporation of PCA and RFE not only optimised the models' predictive performance but also improved computational efficiency and can reduce overfitting. The experimental results also demonstrate that the proposed ANN with PCA method outperforms the state-of-the-art methods. Consequently, this research highlights the potential of combining advanced feature selection techniques with machine learning algorithms to develop robust solutions for phishing detection, which in turn contributes to a safer internet environment.
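The pairing of classifiers with feature reduction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic stand-in data from make_classification rather than the Kaggle dataset, and every hyperparameter (number of components, tree counts, hidden layer size, the helper names build_models and evaluate) is a hypothetical choice made for the example. The abstract does not state which of the four pairings were all evaluated, so all four are built here for completeness.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_models(n_components=10, seed=0):
    """Build RF and ANN pipelines, each paired with PCA or RFE (assumed settings)."""

    def rf():
        return RandomForestClassifier(n_estimators=50, random_state=seed)

    def ann():
        # A small multilayer perceptron stands in for the paper's ANN.
        return MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                             random_state=seed)

    def pca():
        return PCA(n_components=n_components, random_state=seed)

    def rfe():
        # RFE ranks features using an estimator that exposes importances;
        # a small random forest is an assumed choice here.
        ranker = RandomForestClassifier(n_estimators=20, random_state=seed)
        return RFE(ranker, n_features_to_select=n_components, step=5)

    return {
        "RF + PCA": make_pipeline(StandardScaler(), pca(), rf()),
        "RF + RFE": make_pipeline(StandardScaler(), rfe(), rf()),
        "ANN + PCA": make_pipeline(StandardScaler(), pca(), ann()),
        "ANN + RFE": make_pipeline(StandardScaler(), rfe(), ann()),
    }


def evaluate(seed=0):
    """Fit each pipeline on synthetic data and return held-out accuracy."""
    # Synthetic placeholder data; class weights loosely mirror the
    # 6,157 legitimate vs 4,898 phishing split reported in the abstract.
    X, y = make_classification(n_samples=600, n_features=30, n_informative=12,
                               weights=[0.557, 0.443], random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    scores = {}
    for name, model in build_models(seed=seed).items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return scores


if __name__ == "__main__":
    for name, acc in evaluate().items():
        print(f"{name}: {acc:.3f}")
```

Accuracies from this sketch come from synthetic data and are not comparable to the 95.83% and 95.07% figures reported above. Wrapping each reduction step and classifier in one pipeline keeps the PCA projection and the RFE ranking fitted on the training split only, which avoids leaking test data into the feature reduction.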

Item Type: Article
Uncontrolled Keywords: Machine learning
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 25 Jun 2025 06:48
Last Modified: 25 Jun 2025 06:48
URI: http://shdl.mmu.edu.my/id/eprint/13994
