Transfer Learning and Decision Fusion for Real Time Distortion Classification in Laparoscopic Videos


AlDahoul, Nouar and Abdul Karim, Hezerul and Tan, Myles Joshua Toledo and Fermin, Jamie Ledesma (2021) Transfer Learning and Decision Fusion for Real Time Distortion Classification in Laparoscopic Videos. IEEE Access, 9. pp. 115006-115018. ISSN 2169-3536


Laparoscopic surgery is performed by inserting narrow tubes into the abdomen, without making large incisions in the skin, and is carried out with the aid of a video camera. Laparoscopic videos are affected by various distortions during surgery, which lead to a loss of visual quality. Identifying these distortions is the primary requirement of automated video enhancement systems, which must classify the distortions correctly and then select the proper algorithm to enhance video quality. In addition to being accurate, distortion classification must be fast enough to run under real-time conditions. This paper aims to address the issues faced by similar methods by developing a fast and accurate deep learning model for distortion classification. The dataset provided for the ICIP2020 conference challenge was used to train and evaluate the proposed method. This challenging dataset contains videos with five types of distortion (noise, smoke, uneven illumination, defocus blur, and motion blur) with four levels of intensity. This paper discusses the proposed solution, which received the first prize in the ICIP2020 challenge. The solution used a transfer learning approach to transfer representations from the domain of natural images to the domain of laparoscopic videos. We used a pre-trained ResNet50 convolutional neural network (CNN) to extract informative features, which support vector machine (SVM) classifiers then mapped to the various distortion categories. In this work, the problem of multiple distortions in the same video was formulated as a multi-label distortion classification problem. Transfer learning with decision fusion was found to outperform other solutions in terms of accuracy (83%), F1 score for a single distortion (94.7%), and F1 score for single and multiple distortions (94.9%). In addition, the proposed solution can run in real time with an inference speed of 20 frames per second (FPS).
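
The multi-label formulation and the decision fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors used, so the sketch assumes simple majority voting over per-frame predictions; the function names, the `min_vote_fraction` threshold, and the label spellings are hypothetical. The per-frame label sets stand in for the outputs of the ResNet50 feature extractor and SVM classifiers, which are not reproduced here.

```python
from collections import Counter

# The five distortion types named in the abstract (identifier spellings are ours).
DISTORTIONS = [
    "noise",
    "smoke",
    "uneven_illumination",
    "defocus_blur",
    "motion_blur",
]


def to_multi_hot(labels):
    """Encode a set of distortion names as a multi-label (multi-hot) vector."""
    return [1 if name in labels else 0 for name in DISTORTIONS]


def fuse_frame_decisions(frame_predictions, min_vote_fraction=0.5):
    """Fuse per-frame label sets into one video-level label set.

    Assumption: a distortion is assigned to the video when more than
    `min_vote_fraction` of the frames voted for it. This is a generic
    majority-vote rule, not necessarily the rule used in the paper.
    """
    if not frame_predictions:
        return set()
    votes = Counter()
    for labels in frame_predictions:
        votes.update(labels)
    total = len(frame_predictions)
    return {name for name in DISTORTIONS if votes[name] / total > min_vote_fraction}


# Hypothetical per-frame predictions for one short video.
frames = [
    {"noise", "smoke"},
    {"noise"},
    {"noise", "smoke"},
    {"smoke", "motion_blur"},
]
video_labels = fuse_frame_decisions(frames)
video_vector = to_multi_hot(video_labels)
```

In this example, noise and smoke each receive three of four votes and are kept, while motion blur receives one vote and is dropped. For reference, the reported speed of 20 FPS corresponds to a budget of 1/20 s = 50 ms per frame for feature extraction and classification combined.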

Item Type: Article
Uncontrolled Keywords: Real-time data processing
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science > QA76.75-76.765 Computer software
Divisions: Faculty of Engineering (FOE)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 30 Aug 2021 14:30
Last Modified: 30 Aug 2021 14:30

