ConMatFormer: A multi-attention and transformer integrated ConvNext based deep learning model for enhanced diabetic foot ulcer classification

Citation

Rifat, Raihan Ahamed and Bhoyan, Fuyad Hasan and Mehedi, Md Humaion Kabir and Hossain, Md Kaviul and Hossen, Md. Jakir and Mridha, M. F. (2025) ConMatFormer: A multi-attention and transformer integrated ConvNext based deep learning model for enhanced diabetic foot ulcer classification. Results in Engineering, 28. p. 108248. ISSN 2590-1230

Text
ConMatFormer_ A multi-attention and transformer integrated ConvNext based deep learning model for enhanced diabetic foot ulcer classification.pdf - Published Version
Restricted to Repository staff only
Download (4MB)

Official URL: https://doi.org/10.1016/j.rineng.2025.108248

Abstract

Diabetic foot ulcer (DFU) detection is a clinically significant yet challenging task due to the scarcity and variability of publicly available datasets. Limited annotated samples restrict the ability of conventional deep learning models to achieve robust generalization in real-world clinical scenarios. To solve these problems, we propose ConMatFormer, a new hybrid deep learning architecture that combines ConvNeXt blocks, multiple attention mechanisms, namely the convolutional block attention module (CBAM) and the dual attention network (DANet), and transformer modules in a way that works together. This design facilitates the extraction of better local features and understanding of the global context, which allows us to model small skin patterns across different types of DFU very accurately. To address the class imbalance, we used data augmentation methods. A ConvNeXt block was used to obtain detailed local features in the initial stages. Subsequently, we compiled the model by adding a transformer module to enhance long-range dependency. This enabled us to pinpoint the DFU classes that were underrepresented or constituted minorities. Tests on the DS1 (DFUC2021) and DS2 (diabetic foot ulcer (DFU)) datasets showed that ConMatFormer outperformed state-of-the-art (SOTA) convolutional neural network (CNN) and Vision Transformer (ViT) models in terms of accuracy, reliability, and flexibility. The proposed method achieved an accuracy of 0.8961 and a precision of 0.9160 in a single experiment, which is a significant improvement over the current standards for classifying DFUs. In addition, under 4-fold cross-validation, the proposed model achieved an accuracy of 0.9755 with a standard deviation of only 0.0031. We further applied explainable artificial intelligence (XAI) methods, such as Grad-CAM, Grad-CAM++, and LIME, to consistently monitor the transparency and trustworthiness of the decision-making process. These human-readable tools enhance the comprehension of the explanations and can substantially increase the practical use of our methodology. Our findings set a new benchmark for DFU classification and provide a hybrid attention transformer framework for medical image analysis.
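The abstract reports a mean accuracy and standard deviation over 4-fold cross-validation. As a rough illustration of how such a summary is typically produced, the sketch below splits labels into class-balanced folds and aggregates per-fold scores. It is not the authors' code: the function names `stratified_folds` and `summarize`, the toy labels, the placeholder fold accuracies, and the use of the sample (rather than population) standard deviation are all assumptions made for this example.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k=4, seed=0):
    """Assign sample indices to k folds, keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def summarize(fold_scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


# Hypothetical toy labels, not drawn from DS1 or DS2.
labels = ["ulcer"] * 8 + ["healthy"] * 4
folds = stratified_folds(labels, k=4)

# Placeholder per-fold accuracies for illustration only, not the paper's results.
mean_acc, std_acc = summarize([0.90, 0.92, 0.91, 0.93])
```

In a real run, each fold would serve once as the validation set while the model is trained on the remaining three, and the four validation accuracies would replace the placeholder list passed to `summarize`.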

Item Type:	Article
Uncontrolled Keywords:	Diabetic foot ulcers classification
Subjects:	Q Science > Q Science (General) > Q300-390 Cybernetics; R Medicine > RC Internal medicine
Divisions:	Faculty of Engineering and Technology (FET)
Depositing User:	Ms Rosnani Abd Wahab
Date Deposited:	22 Dec 2025 03:08
Last Modified:	26 Dec 2025 03:27
URI:	http://shdl.mmu.edu.my/id/eprint/15088
