Deep multi-level feature pyramids: Application for non-canonical firearm detection in video surveillance

Citation

Lim, Jun Yi and Al Jobayer, Md Istiaque and Baskaran, Vishnu Monn and Lim, Joanne Mun Yee and See, John Su Yang and Wong, Kok Sheik (2021) Deep multi-level feature pyramids: Application for non-canonical firearm detection in video surveillance. Engineering Applications of Artificial Intelligence, 97. p. 104094. ISSN 0952-1976

Abstract

The worldwide epidemic of gun violence calls for an active-based video surveillance network to combat this crime. In this context, autonomously detecting handguns is crucial in capturing firearm-related crimes. However, current deep learning object detectors are unable to capture handguns at different scales in an unconstrained environment. Hence, this paper puts forward an enhanced deep multi-level feature pyramid network that addresses the difficulty of inferring handguns from a non-canonical perspective. We first construct a dataset containing handguns in an unconstrained environment for representation learning. The dataset is built from a set of 250 recorded videos and contains over 2500 distinct labeled frames. Crucially, these labeled frames address the absence of a proper video surveillance-based handgun dataset. We then train a multi-level multi-scale object detector, i.e., M2Det, on this dataset. We further improve the performance of M2Det by: (1) enhancing the base features by concatenating shallow, medium and deep features from the backbone according to their relative receptive fields; (2) implementing generalized intersection-over-union as its localization loss; and (3) integrating Focal Loss as its classification loss to improve detection of small-scale handguns. Experiments on a challenging video surveillance test dataset demonstrate that the proposed model achieves 87.42% accuracy. In addition, we implement adaptive surveillance image partitioning to redetect handguns at specific regions. This method potentially solves the challenge of sporadically poor real-world handgun classifications. The model is capable of pioneering non-canonical handgun detection for active-based video surveillance systems. The dataset and trained models are available at https://github.com/MarcusLimJunYi/Monash-Guns-Dataset.
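
For readers unfamiliar with the two loss terms named in the abstract, the following is a minimal sketch of their standard textbook forms, not the authors' implementation. The function names, the (x1, y1, x2, y2) box convention, and the values alpha=0.25 and gamma=2.0 are assumptions chosen for illustration; the record does not state the settings used in the paper.

```python
import math


def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Overlap region (zero if the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest box enclosing both inputs; the share of it not covered by
    # the union penalizes boxes that are far apart.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Localization loss: 0 for identical boxes, approaching 2 for distant ones."""
    return 1.0 - giou(box_a, box_b)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction.

    p is the predicted probability of the positive class and y is 0 or 1.
    alpha and gamma are illustrative values, not the paper's settings.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # keep log() finite
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Unlike plain intersection-over-union, the generalized form still varies when the predicted and ground-truth boxes do not overlap, while the focal term shrinks the contribution of confidently classified examples so that hard ones, such as small-scale objects, carry more of the loss.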

Item Type: Article
Uncontrolled Keywords: Video surveillance, Active video surveillance, non-canonical firearm detection, deep neural network, multi-level feature pyramids
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5101-6720 Telecommunication. Including telegraphy, telephone, radio, radar, television
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 09 Mar 2021 00:44
Last Modified: 09 Mar 2021 00:44
URI: http://shdl.mmu.edu.my/id/eprint/8562
