Citation
Assaduzzaman, Md and Bijoy, Md. Hasan Imam and Alam, Mohammad Jahangir and Hasan, Md Zahid and Fahad, Nafiz and Liew, Tze Hui and Ohidujjaman, Ohidujjaman (2026) An integrated meta-learning and explainable analytics approach for clinical thyroid disease classification. Healthcare Analytics, 9. p. 100461. ISSN 2772-4425
Text
5.pdf - Published Version (9MB). Restricted to Repository staff only.
Abstract
Thyroid disorders affect millions globally, underscoring the urgent need for accurate and reliable diagnostic tools. Conventional diagnostic methods are often time-consuming, invasive, and prone to inconclusive results, whereas existing machine learning (ML) approaches continue to face persistent challenges with outliers, data imbalance, interpretability, and generalization. To address these challenges, this study proposes a robust meta-learning framework that integrates hybrid outlier handling, feature selection, Bayesian hyperparameter optimization, and explainable artificial intelligence (XAI) for binary classification of thyroid disease. The outlier-handling stage combines univariate Interquartile Range (IQR) analysis, multivariate Isolation Forest detection, and regression-based contextual imputation. Class imbalance was mitigated using Random Oversampling (ROS), and key predictive features were identified using Recursive Feature Elimination (RFE). The selected features were used to train Random Forest and XGBoost models, which were subsequently combined in a stacking ensemble with a logistic regression meta-learner. The proposed framework demonstrated state-of-the-art performance, achieving an accuracy of 99.74%, an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) of 0.9994, and a Cohen's Kappa score of 0.9769. Stratified 10-fold cross-validation confirmed its stability with an average accuracy of 99.70%, highlighting strong generalization. Robustness tests under adversarial perturbations (ε = 0.01, 0.05, 0.1) and Gaussian noise demonstrated minimal performance degradation, with accuracies consistently above 96%. Model transparency was achieved using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), which provide global and local explanations of feature contributions.
Overall, the proposed framework demonstrates high accuracy, robustness, and transparency, supporting its suitability for real-world AI-assisted thyroid disease diagnosis.
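As a rough illustration of three building blocks named in the abstract (univariate IQR outlier flagging, random oversampling, and Cohen's Kappa), the sketch below uses only the Python standard library. It is not the authors' implementation: the function names, the 1.5 fence multiplier, the quartile method, and the toy values are all assumptions made for this example. The Isolation Forest, regression-based imputation, RFE, Bayesian optimization, and stacking stages are not covered here.

```python
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences; k=1.5 is an assumed, conventional multiplier."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Mark each value True if it falls outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    classes = set(y_true) | set(y_pred)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    if p_e == 1.0:  # degenerate case: a single class in both vectors
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy values for one hypothetical numeric feature; not data from the study.
feature = [1.2, 1.4, 1.3, 1.5, 1.1, 9.8]
print(flag_outliers(feature))

rows, labels = random_oversample([[0.1], [0.2], [0.3], [0.4], [0.9]], [0, 0, 0, 0, 1])
print(labels.count(0), labels.count(1))

print(cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```

In a full pipeline, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data; whether the study did this is not stated in the abstract.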
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Machine learning |
| Subjects: | R Medicine > R Medicine (General) > R858-859.7 Computer applications to medicine. Medical informatics |
| Divisions: | Faculty of Information Science and Technology (FIST) |
| Depositing User: | Ms Rosnani Abd Wahab |
| Date Deposited: | 04 May 2026 02:02 |
| Last Modified: | 07 May 2026 08:03 |
| URI: | http://shdl.mmu.edu.my/id/eprint/15822 |
