Cyberbullying detection using emotion mining

Citation

Al-Hashedi, Mohammed Yahea Ali Mahyoub (2022) Cyberbullying detection using emotion mining. Masters thesis, Multimedia University.

Full text not available from this repository.
Official URL: http://erep.mmu.edu.my/

Abstract

The expansion of information and communication technologies (ICTs) has led to developments in online communication. Regrettably, such convenience has been abused by online bullies, who cause harm by threatening, harassing, humiliating, intimidating, manipulating, or controlling targeted victims. Cyberbullying can have a severe impact on a victim’s mental health, ranging from negative emotions (anger, fear, sadness, guilt, etc.) to depression and even suicidal thoughts. Due to these potentially harmful consequences, cyberbullying detection has become a pressing need in the governance of Internet usage. The research presented in this thesis is motivated by the fact that negative emotions can be caused by cyberbullying, and it proposes cyberbullying detection models that are trained on contextual, emotion, and sentiment features. In this work, all critical steps were taken into consideration, from data preparation to deep learning models. Datasets that encompass all forms of cyberbullying, such as threatening, harassing, humiliating, intimidating, and manipulating or controlling targeted victims, are sparse. To address this issue, this research utilized two datasets: the Toxic dataset, collected by the Conversation AI team, and the Twitter dataset. Cyberbullying datasets generally suffer from an imbalance between their labels; therefore, sampling techniques were developed to reduce the imbalance ratio. After dataset preparation, the next step in detecting cyberbullying was extracting textual features, such as syntactic, semantic, contextual, and emotion features. Emotion features in particular were thoroughly investigated through the use of a lexicon-based deep learning model. To build an emotion detection model, the emotion datasets were collected from Twitter using hashtag keywords and were categorized based on these keywords. 
Due to the potential inaccuracy of the hashtag labelling, a validation procedure was then carried out to authenticate the annotation of the emotion dataset labels. The validated dataset was then used to train the emotion detection model (EDM) using BERT as a pre-trained word representation model. This model was used to study and explore the emotions related to cyberbullying texts. The results indicate that 92% of cyberbullying emotions are categorized as negative. Emotions and sentiments were extracted from the cyberbullying datasets using the EDM and the NRC lexicon for emotions and the AFINN lexicon for sentiments. These features were fed to deep learning models to train cyberbullying detection models. A set of experiments was carried out to investigate the best set of features for cyberbullying detection. The findings indicate that incorporating emotion features can enhance the precision of detecting cyberbullying, as this approach outperformed the use of BERT contextual features only. In the experiment that involved emotion features, the recall score was 0.87, which led to a 0.5 increase in the performance of cyberbullying detection compared to using only BERT. Similarly, incorporating sentiment features improved the model by 0.6 recall compared to only utilizing BERT.
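
As a rough illustration of the feature combination described in the abstract, the following minimal sketch appends lexicon-derived emotion counts and a sentiment score to a contextual feature vector. It is not the thesis implementation: the tiny lexicons are hypothetical stand-ins for the NRC and AFINN lexicons, the function names are invented for this example, and the contextual vector is a placeholder for a BERT embedding.

```python
from typing import Dict, List, Sequence

# Hypothetical miniature lexicons, used only for illustration.
# The real AFINN and NRC lexicons are much larger and are not reproduced here.
TOY_SENTIMENT: Dict[str, int] = {"hate": -3, "stupid": -2, "love": 3, "nice": 2}
TOY_EMOTION: Dict[str, List[str]] = {
    "hate": ["anger"],
    "stupid": ["anger", "disgust"],
    "scared": ["fear"],
    "love": ["joy"],
}
EMOTIONS: List[str] = ["anger", "disgust", "fear", "joy"]


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (tok.strip(".,!?;:").lower() for tok in text.split())
    return [tok for tok in stripped if tok]


def sentiment_score(tokens: Sequence[str]) -> float:
    """Sum the valence of every token found in the sentiment lexicon."""
    return float(sum(TOY_SENTIMENT.get(tok, 0) for tok in tokens))


def emotion_counts(tokens: Sequence[str]) -> List[float]:
    """Count lexicon hits per emotion, in the fixed order given by EMOTIONS."""
    counts = {emotion: 0 for emotion in EMOTIONS}
    for tok in tokens:
        for emotion in TOY_EMOTION.get(tok, []):
            counts[emotion] += 1
    return [float(counts[emotion]) for emotion in EMOTIONS]


def build_features(text: str, contextual: Sequence[float]) -> List[float]:
    """Concatenate contextual features, emotion counts, and sentiment score.

    `contextual` is assumed to be a precomputed sentence embedding
    (a BERT vector in the abstract's setting); here it is a placeholder.
    """
    tokens = tokenize(text)
    return list(contextual) + emotion_counts(tokens) + [sentiment_score(tokens)]
```

For example, `build_features("You are stupid, I hate you!", [0.1, 0.2])` yields the two placeholder contextual values followed by four emotion counts and one sentiment score. Simple concatenation is only one way to fuse such features; the abstract does not state how the thesis combines them.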

Item Type: Thesis (Masters)
Additional Information: Call No.: BF576.2.E47 A44 2022
Uncontrolled Keywords: Emotion recognition
Subjects: B Philosophy. Psychology. Religion > BF Psychology (General) > BF1-990 Psychology
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 11 Jan 2024 01:55
Last Modified: 11 Jan 2024 01:55
URI: http://shdl.mmu.edu.my/id/eprint/12041
