Transfer Detection of YOLO to Focus CNN’s Attention on Nude Regions for Adult Content Detection

Citation

AlDahoul, Nouar and Abdul Karim, Hezerul and Lye Abdullah, Mohd Haris and Ahmad Fauzi, Mohammad Faizal and Ba Wazir, Abdulaziz Saleh and Mansor, Sarina and See, John Su Yang (2020) Transfer Detection of YOLO to Focus CNN’s Attention on Nude Regions for Adult Content Detection. Symmetry, 13 (1). p. 26. ISSN 2073-8994

Abstract

Video pornography and nudity detection aims to detect people in videos and classify them as nude or normal for censorship purposes. Recent literature has demonstrated pornography detection that uses a convolutional neural network (CNN) to extract features directly from whole frames and a support vector machine (SVM) to classify the extracted features into two categories. However, existing methods were not able to detect small-scale pornographic and nude content in frames with diverse backgrounds. This limitation has led to a high false-negative rate (FNR), that is, misclassification of nude frames as normal ones. To address this limitation of existing convolutional-only approaches, this paper focuses the visual attention of the CNN on the expected nude regions inside the frames to reduce the FNR. The You Only Look Once (YOLO) object detector was transferred to the pornography and nudity detection application to detect persons as regions of interest (ROIs), which were then passed to a CNN and an SVM for nude/normal classification. Several experiments were conducted to compare the performance of various CNNs and classifiers using our proposed dataset. ResNet101 with random forest was found to outperform the other models, with an F1-score of 90.03% and an accuracy of 87.75%. Furthermore, an ablation study was performed to demonstrate the impact of adding YOLO before the CNN. YOLO–CNN was shown to outperform CNN-only in terms of accuracy, which increased from 85.5% to 89.5%. Additionally, a new benchmark dataset with challenging content, including various human sizes and backgrounds, was proposed.
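
The abstract reports results as FNR, F1-score and accuracy. As a reading aid, the following minimal Python sketch shows how these three metrics are conventionally computed for a binary nude/normal task, with "nude" taken as the positive class. The names `Confusion`, `count_confusion`, `fnr`, `accuracy` and `f1` and the toy labels are illustrative assumptions; they are not taken from the paper and do not reproduce its figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts with 'nude' as the positive class (assumed)."""
    tp: int  # nude frames predicted nude
    fn: int  # nude frames predicted normal (missed detections)
    fp: int  # normal frames predicted nude
    tn: int  # normal frames predicted normal


def count_confusion(y_true, y_pred, positive="nude"):
    """Count TP/FN/FP/TN from two equal-length label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return Confusion(tp=tp, fn=fn, fp=fp, tn=tn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def fnr(c):
    """False-negative rate: share of nude frames classified as normal."""
    return _ratio(c.fn, c.tp + c.fn)


def accuracy(c):
    """Share of all frames classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fn + c.fp + c.tn)


def f1(c):
    """Harmonic mean of precision and recall, written in count form."""
    return _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)


# Toy labels, purely illustrative (not data from the paper).
y_true = ["nude", "nude", "nude", "normal", "normal"]
y_pred = ["nude", "normal", "nude", "normal", "nude"]
c = count_confusion(y_true, y_pred)
print(c, fnr(c), accuracy(c), f1(c))
```

A lower FNR means fewer nude frames slip through as normal, which is the failure mode the abstract says the YOLO stage is meant to reduce.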

Item Type: Article
Uncontrolled Keywords: Pornography detection; nudity detection; convolutional neural network; you only look once; feature extraction; visual attention; region of interest
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5101-6720 Telecommunication. Including telegraphy, telephone, radio, radar, television
Divisions: Faculty of Engineering (FOE)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 04 Oct 2021 05:45
Last Modified: 04 Oct 2021 05:45
URI: http://shdl.mmu.edu.my/id/eprint/8469
