Explainability-Driven Comparison of Machine Learning Approaches for Breast Cancer Classification

Citation

Chakraborty, Shuvo and Tusher, Ekramul Haque and Rabbi, Riadul Islam and Liew, Tze Hui and Shorif, Md Hasan and Hossain, Mohammad Imtiaz (2025) Explainability-Driven Comparison of Machine Learning Approaches for Breast Cancer Classification. In: 2025 8th International Conference on New Media Studies (CONMEDIA), 14-17 October 2025, Malacca, Malaysia.

Text
22.pdf - Published Version
Restricted to Repository staff only

Abstract

Breast cancer remains a major global health problem, and early, accurate diagnosis is essential for improving patient survival. Traditional diagnostic methods, although effective, can be costly, time-consuming, and prone to error. The challenge is therefore to build computational tools that help clinicians make fast, accurate, and interpretable predictions of whether a case is malignant or benign. To address this, several machine learning models are applied to the breast cancer dataset: Support Vector Machines (SVM), Random Forest, Naïve Bayes, Decision Tree, Logistic Regression, and k-Nearest Neighbors. Solving this problem reduces diagnostic ambiguity, supports prompt medical intervention, and provides a robust, scalable framework for early detection. The work comprised preprocessing the data, converting the categorical diagnosis labels into binary classes, scaling features where needed, and training several models for comparison. After the models were evaluated on accuracy, precision, recall, and F1 score, the SVM's hyperparameters were tuned to improve predictive performance. The results showed that ensemble and kernel-based approaches performed well, with the tuned SVM achieving the best balance across performance measures (Accuracy = 98.24%, Precision = 100%, Recall = 95.35%, F1 Score = 97.62%). Interpretability was addressed with SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations), which make the models' decisions more transparent by highlighting the features that most influence predictions. This study presents a machine learning pipeline that bridges the gap between clinical trustworthiness and computational performance by combining interpretability with high prediction accuracy.
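
The four reported measures are all derived from the same binary confusion matrix, so their relationship can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not report a confusion matrix or a test-set size, so the counts used here (41 true positives, 0 false positives, 2 false negatives, 71 true negatives) are hypothetical values chosen because they give figures close to the reported ones. The names `ConfusionCounts` and `classification_metrics` are likewise introduced just for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, treating malignant as the positive class."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    fn: int  # malignant cases predicted benign
    tn: int  # benign cases predicted benign


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall and F1 as fractions in [0, 1]."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the paper: they are an assumption
# chosen to be consistent with the metrics quoted in the abstract.
counts = ConfusionCounts(tp=41, fp=0, fn=2, tn=71)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

Under these assumed counts, a precision of 100% alongside a recall below 100% means the model raised no false alarms on benign cases but missed a small number of malignant ones, which is the trade-off the reported figures describe.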

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Breast Cancer Diagnosis; Machine Learning; Support Vector Machine; Naive Bayes; Random Forest; Logistic Regression; SHAP; LIME
Subjects: R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Ms Suzilawati Abu Samah
Date Deposited: 20 Apr 2026 03:45
Last Modified: 20 Apr 2026 03:45
URI: http://shdl.mmu.edu.my/id/eprint/15775
