Citation
Al Farid, Fahmid (2022) Vision-Based Hand Gesture Recognition with Multiple View Angles. PhD thesis, Multimedia University.

Full text not available from this repository.

Abstract
With today’s enormous population, innovative human-computer interaction systems can be employed to improve our quality of life, and vision-based gesture recognition technologies can benefit the disabled as well as other users. Gesture recognition from video frames is a difficult problem because the properties of each gesture vary substantially from person to person. In this work, we present a vision-based hand gesture recognition method that generates image frames from RGB video input; gesture-based systems are intuitive, spontaneous, and straightforward. Previous work has attempted to recognise hand gestures in various scenarios, and according to our studies, gesture recognition systems are either based on wearable sensors or vision-based. Our proposed method is a vision-based gesture recognition system, and in this research we address the problem of recognising gestures from multiple view angles.

Firstly, in our proposed system, image acquisition starts with RGB video captured by the Kinect sensor. We blur the image frames, one after another, to remove background noise, then convert every frame of the video to the HSV colour space. After that, we apply dilation, erosion, filtering, and thresholding operations; these morphological image processing techniques turn the frames into black-and-white (binary) images (sketched below).

Secondly, we propose applying deep learning methods, specifically convolutional neural networks (CNNs), to recognise hand gestures automatically from RGB and depth data; either form of data may be used to train the networks. Our technique is evaluated as a vision-based gesture recognition system. In this approach, image collection starts with RGB video and depth information captured with the Kinect sensor, followed by tracking the hand using a single-shot detector CNN (SSD CNN). When a kernel is applied, it produces an output value at each of the m × n locations of a feature map, and each new feature layer, using a collection of convolutional filters, generates a fixed set of gesture detection predictions (sketched below). After that, we perform deep dilation to make the gesture in the image masks more visible. For each gesture video, we obtain one trajectory over its frames, which serves as the feature vector; these are the input features for gesture classification.

Finally, using the well-known SVM classification algorithm (sketched below), we recognise hand gestures on the SKIG dataset with an accuracy of 91.01% on RGB input, higher than state-of-the-art hand-crafted methods. On our own multiple view angle gesture (MVAG) dataset, we obtained accuracies of 89.41% for gestures recorded at a 0-degree angle and 84.61% at a 45-degree angle. Using deep learning, however, we recognise hand gestures on the SKIG dataset with still higher accuracy than the state of the art: 93.68% on the RGB stream, 83.45% on the depth stream, and 90.61% with RGB-D fusion. On the MVAG dataset, we likewise obtained higher accuracies of 91.68% (RGB), 81.45% (depth), and 88.61% (RGB-D) for gestures recorded at a 0-degree angle, and 89.62% (RGB), 80.55% (depth), and 88.61% (RGB-D) for gestures recorded at a 45-degree angle.
In addition, we obtained accuracies of 87.52% (RGB), 79.65% (depth), and 86.51% (RGB-D) for gestures recorded at a 60-degree angle. The framework thus applies distinct methodologies to construct a superior vision-based hand gesture recognition system, and the accuracy is promising regardless of the orientation of the camera. The performance of all the algorithms proposed in this thesis is evaluated on the SKIG benchmark dataset and on our own MVAG dataset, and the experimental results show that the proposed methods outperform existing methods.
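The preprocessing stage described in the abstract (blur, HSV conversion, thresholding, and morphological erosion/dilation) can be illustrated with a minimal OpenCV sketch. The skin-colour bounds and kernel sizes below are illustrative assumptions, not parameters taken from the thesis.

```python
# A minimal sketch of the frame preprocessing stage, assuming OpenCV.
import cv2
import numpy as np

def preprocess_frame(frame_bgr: np.ndarray) -> np.ndarray:
    """Turn one video frame into a binary (black-and-white) hand mask."""
    # Blur to suppress background noise.
    blurred = cv2.GaussianBlur(frame_bgr, (5, 5), 0)

    # Convert to the HSV colour space, which separates colour from intensity.
    hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)

    # Threshold on a (hypothetical) skin-colour range to isolate the hand.
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)

    # Erosion then dilation removes speckle noise and fills small holes.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.erode(mask, kernel, iterations=1)
    mask = cv2.dilate(mask, kernel, iterations=2)
    return mask

# Usage with a synthetic frame (a real pipeline would read Kinect video frames).
frame = np.zeros((480, 640, 3), dtype=np.uint8)
binary_mask = preprocess_frame(frame)
```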
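The abstract's description of the SSD CNN, where a kernel produces an output value at each of the m × n feature-map locations and convolutional filters emit a fixed set of detection predictions, can be sketched as a single SSD-style prediction head. This is a generic PyTorch illustration under assumed layer sizes, not the thesis's actual network.

```python
# A minimal sketch of an SSD-style detection head, assuming PyTorch.
import torch
import torch.nn as nn

class SSDHead(nn.Module):
    def __init__(self, in_channels: int, num_anchors: int, num_classes: int):
        super().__init__()
        self.num_anchors = num_anchors
        self.num_classes = num_classes
        # One set of convolutional filters per anchor: class scores + 4 box offsets.
        self.pred = nn.Conv2d(
            in_channels, num_anchors * (num_classes + 4),
            kernel_size=3, padding=1,
        )

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # feature_map: (batch, in_channels, m, n)
        out = self.pred(feature_map)       # (batch, A * (C + 4), m, n)
        b, _, m, n = out.shape
        # One prediction vector per anchor at each of the m x n locations.
        return out.view(b, self.num_anchors, self.num_classes + 4, m, n)

# Example: a 38x38 feature map, 4 anchors, 2 classes (hand / background).
head = SSDHead(in_channels=512, num_anchors=4, num_classes=2)
preds = head(torch.randn(1, 512, 38, 38))
print(preds.shape)  # torch.Size([1, 4, 6, 38, 38])
```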
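The final classification stage, where each gesture video yields one trajectory used as the feature vector for an SVM, can be sketched with scikit-learn. The fixed-length resampling of the trajectory and the RBF kernel are assumptions for illustration; the thesis's actual feature encoding and kernel may differ.

```python
# A minimal sketch of trajectory features fed to an SVM, assuming scikit-learn.
import numpy as np
from sklearn.svm import SVC

def trajectory_features(points: np.ndarray, n_samples: int = 32) -> np.ndarray:
    """Resample a (T, 2) trajectory of hand centres to a fixed-length vector."""
    t_old = np.linspace(0.0, 1.0, len(points))
    t_new = np.linspace(0.0, 1.0, n_samples)
    xs = np.interp(t_new, t_old, points[:, 0])
    ys = np.interp(t_new, t_old, points[:, 1])
    return np.concatenate([xs, ys])  # 2 * n_samples feature values

# Hypothetical training data: one tracked trajectory per gesture video.
rng = np.random.default_rng(0)
X = np.stack([trajectory_features(rng.random((40, 2))) for _ in range(20)])
y = rng.integers(0, 2, size=20)  # gesture class labels

clf = SVC(kernel="rbf")  # a common default, not necessarily the thesis's choice
clf.fit(X, y)
print(clf.predict(X[:3]))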
| Item Type: | Thesis (PhD) |
|---|---|
| Additional Information: | Call No.: TA1652 .F34 2022 |
| Uncontrolled Keywords: | Gesture recognition |
| Subjects: | T Technology > TA Engineering (General). Civil engineering (General) > TA1501-1820 Applied optics. Photonics |
| Divisions: | Faculty of Computing and Informatics (FCI) |
| Depositing User: | Ms Nurul Iqtiani Ahmad |
| Date Deposited: | 28 Nov 2023 03:55 |
| Last Modified: | 28 Nov 2023 03:55 |
| URI: | http://shdl.mmu.edu.my/id/eprint/11871 |