An integrated meta-learning and explainable analytics approach for clinical thyroid disease classification

Citation

Assaduzzaman, Md and Bijoy, Md. Hasan Imam and Alam, Mohammad Jahangir and Hasan, Md Zahid and Fahad, Nafiz and Liew, Tze Hui and Ohidujjaman, Ohidujjaman (2026) An integrated meta-learning and explainable analytics approach for clinical thyroid disease classification. Healthcare Analytics, 9. p. 100461. ISSN 2772-4425

Abstract

Thyroid disorders affect millions globally, underscoring the urgent need for accurate and reliable diagnostic tools. Conventional diagnostic methods are often time-consuming, invasive, and prone to inconclusive results, whereas existing machine learning (ML) approaches continue to face persistent challenges with outliers, data imbalance, interpretability, and generalization. To address these challenges, this study proposes a robust meta-learning framework that integrates hybrid outlier handling, feature selection, Bayesian hyperparameter optimization, and explainable artificial intelligence (XAI) for binary classification of thyroid disease. The outlier-handling component combines univariate Interquartile Range (IQR) analysis, multivariate Isolation Forest detection, and regression-based contextual imputation. Class imbalance was mitigated using Random Oversampling (ROS), and key predictive features were identified using Recursive Feature Elimination (RFE). The selected features were used to train Random Forest and XGBoost models, which were subsequently combined in a stacking ensemble with a logistic regression meta-learner. The proposed framework demonstrated state-of-the-art performance, achieving an accuracy of 99.74%, an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) of 0.9994, and a Cohen's Kappa score of 0.9769. Stratified 10-fold cross-validation confirmed its stability with an average accuracy of 99.70%, highlighting strong generalization. Robustness tests under adversarial perturbations (ε = 0.01, 0.05, 0.1) and Gaussian noise demonstrated minimal performance degradation, with accuracies consistently above 96%. Model transparency was achieved using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), which provide global and local explanations of feature contributions.
Overall, the proposed framework demonstrates high accuracy, robustness, and transparency, supporting its suitability for real-world AI-assisted thyroid disease diagnosis.
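
As an illustration of two of the preprocessing steps named in the abstract, the following is a minimal sketch of univariate IQR outlier flagging and random oversampling, written with the Python standard library only. It is not the authors' implementation: the function names, the 1.5 fence multiplier, the quartile method, and the toy values are all assumptions made for this example, and the Isolation Forest, regression-based imputation, RFE, and stacking stages are not shown.

```python
import random
import statistics
from collections import Counter


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the common Tukey convention, assumed here; the abstract
    does not state the multiplier used in the study.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: one numeric feature and a binary label.
feature = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 9.5]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

mask = iqr_outlier_mask(feature)
rows, balanced = random_oversample([[v] for v in feature], labels)
print(mask)
print(Counter(balanced))
```

In practice, oversampling is usually applied to the training split only, so that duplicated minority rows do not leak into the evaluation data. The abstract does not say at which point in the pipeline the study applied it.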

Item Type: Article
Uncontrolled Keywords: Machine learning
Subjects: R Medicine > R Medicine (General) > R858-859.7 Computer applications to medicine. Medical informatics
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 04 May 2026 02:02
Last Modified: 07 May 2026 08:03
URI: http://shdl.mmu.edu.my/id/eprint/15822
