Efficient and Interpretable Otoscopic Image Classification via Distilled CNN with Adaptive Channel Attention

Citation

Rehman, Zaka Ur and Fauzi, Mohammad Faizal Ahmad and Lokman, Farhana Nur Iman and Touhami, Meriem and Saim, Lokman (2025) Efficient and Interpretable Otoscopic Image Classification via Distilled CNN with Adaptive Channel Attention. IEEE Access. p. 1. ISSN 2169-3536

Abstract

Accurate classification of otoscopic ear images is crucial for early diagnosis of ear pathologies such as Chronic Otitis Media, Earwax Plug, and Myringosclerosis. In this study, we propose a novel deep learning framework that employs a knowledge distillation strategy, wherein a high-capacity pre-trained teacher model (Vision Transformer or ResNet101) transfers learned representations to a lightweight student CNN model (EfficientNet-B0). The student network is further enhanced through the integration of an Adaptive Channel Attention (ACA) module, which selectively emphasizes informative features via channel-wise recalibration. The multi-scale feature distillation from the teacher improves generalization while the ACA block boosts sensitivity to clinically relevant regions. We validated our method on a publicly available otoscopic image dataset comprising four balanced classes. Our approach achieved an overall accuracy of 98.75%, with class-wise AUC values of 1.000 (CSOM), 1.000 (Earwax Plug), 0.994 (Myringosclerosis), and 0.992 (Normal), and a micro-average AUC of 0.997. Additionally, Grad-CAM analysis confirmed the model’s focus on diagnostically meaningful areas, supporting the interpretability of the predictions. These results demonstrate the effectiveness of our distillation-based ACA-enhanced architecture in otoscopic image classification, with potential to assist clinical decision-making in primary care and telemedicine applications.
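The abstract names two mechanisms: distillation from a teacher to a student, and channel-wise recalibration of features. The following is a minimal sketch of generic versions of both, not the authors' implementation. It assumes a standard soft-target distillation loss (temperature-scaled KL divergence mixed with cross-entropy) in place of the paper's multi-scale feature distillation, and a squeeze-and-excitation-style gate in place of the ACA module, whose internals the abstract does not give. The function names, the `temperature` and `alpha` values, and the single-layer gate are all illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss for one sample (assumed form).

    alpha weights the KL term between softened teacher and student
    distributions; (1 - alpha) weights ordinary cross-entropy on the
    ground-truth label. Both hyperparameters are hypothetical values.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    # Cross-entropy of the student at temperature 1.
    ce = -log_softmax(student_logits)[label]
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


def channel_gate(feature_map, weights, bias):
    """Squeeze-and-excitation-style channel recalibration (assumed form).

    feature_map: list of C channels, each an H x W list of lists.
    weights:     C x C matrix of a single hypothetical linear layer.
    bias:        list of C biases.
    Returns the rescaled feature map and the per-channel gates in (0, 1).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature_map]
    # Excite: linear layer followed by a sigmoid gives per-channel gates.
    gates = []
    for w_row, b in zip(weights, bias):
        z = sum(w * p for w, p in zip(w_row, pooled)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Recalibrate: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates
```

In a real training setup these pieces would be written with a deep learning framework and applied to batches. The plain-Python form is used here only to make the arithmetic explicit.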

Item Type: Article
Uncontrolled Keywords: Otoscopic Image Classification, Adaptive Token Extraction, Ear Disease Diagnosis, Chronic Otitis Media, Myringosclerosis.
Subjects: T Technology > TR Photography > TR624-835 Applied photography Including artistic, commercial, medical photography, photocopying processes
Divisions: Faculty of Artificial Intelligence & Engineering (FAIE)
Depositing User: Ms Suzilawati Abu Samah
Date Deposited: 27 Aug 2025 03:54
Last Modified: 27 Aug 2025 03:54
URI: http://shdl.mmu.edu.my/id/eprint/14435