Sentiment analysis by fusing text and location features of geo-tagged tweets

Citation

Lim, Wei Lun (2021) Sentiment analysis by fusing text and location features of geo-tagged tweets. Masters thesis, Multimedia University.

Full text not available from this repository.
Official URL: http://erep.mmu.edu.my/

Abstract

Sentiment analysis takes text as input and determines whether the sentiment it expresses is negative, neutral, or positive. Sentiment analysis is vital for social development: an organization can use feedback from product reviews or comments about a particular service to improve the quality of life of the communities it serves. People also like to share their opinions with others, even with total strangers, and social media is a common platform for posting their thoughts. They do so either to receive positive feedback through likes and follows from other people, rewards that boost their self-esteem, or simply for social interaction. Researchers have also been working on text sentiment analysis to study people’s emotions and gain insights for better decision-making. Twitter sentiment analysis provides valuable feedback on public emotion about events or products. Current Twitter sentiment research has focused on obtaining sentiment features from vectorized lexical and syntactic features of tweets, without considering additional context from other attributes of a tweet. Location is an important factor affecting people’s emotions that has been neglected. Sometimes it is not the product that makes people want to complain; rather, it is the combination of bad experiences at that location that makes them feel uncomfortable, and they then express their unpleasant feelings on social media. For example, customers may complain that a restaurant’s environment is not pleasant to dine in because they are disturbed by noise from a nearby car repair shop. This work investigated how vectorized location information could be combined with word embeddings to produce a hybrid representation, which resulted in an improvement on a tweet sentiment classification task.
The location information of the geo-tagged tweets provided further context, which was useful for improving a sentiment classification task. The tweets investigated included a set of geo-tagged tweets. The word embeddings of these tweets were combined with the geo-tagged tweets’ vectorized location features or ego network measurements to form a sentiment feature set. This feature set was fed into a Convolutional Neural Network (CNN), a Bidirectional Recurrent Neural Network (BRNN), a Bidirectional Long Short-Term Memory Network (BiLSTM), and a Transformer to train the models and predict sentiment classification labels. The performance of this hybrid representation was compared with that of a baseline model built using GloVe. The experimental results showed that incorporating vectorized location information improved the accuracy of Twitter sentiment classification with CNN and BRNN in the binary classification task, while only CNN improved accuracy in the multiclass classification task. The Transformer showed inconsistent results in both the binary and multiclass classification tasks, indicating that the feature fusion is not suitable for use with the Transformer. To investigate how location information added to the text affects model performance, SHapley Additive exPlanations (SHAP) was used to check feature importance in the CNN. CNN was chosen because it showed improved accuracy in both the binary and multiclass classification tasks. The values generated by SHAP showed that location information affects the model by changing the order of feature importance.
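
The fusion of word embeddings with vectorized location features described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the location was vectorized or how the two feature sets were joined, so everything below is an assumption made for illustration: the function names `normalize_lat_lon` and `fuse_features` are hypothetical, the location is encoded as scaled latitude and longitude, and the fusion is a simple concatenation of the location vector onto every token embedding. It is not the thesis's actual implementation.

```python
import numpy as np


def normalize_lat_lon(lat, lon):
    """Scale latitude and longitude into [-1, 1] (an assumed location encoding)."""
    return np.array([lat / 90.0, lon / 180.0], dtype=np.float32)


def fuse_features(word_vectors, location_vector):
    """Append the same location vector to every token embedding.

    word_vectors:    array of shape (seq_len, embed_dim), e.g. GloVe vectors
    location_vector: array of shape (loc_dim,)
    returns:         array of shape (seq_len, embed_dim + loc_dim)
    """
    word_vectors = np.asarray(word_vectors, dtype=np.float32)
    location_vector = np.asarray(location_vector, dtype=np.float32)
    # Repeat the location vector once per token so the shapes line up.
    tiled = np.tile(location_vector, (word_vectors.shape[0], 1))
    return np.concatenate([word_vectors, tiled], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for the word embeddings of a 5-token tweet (50-dimensional).
    tweet_embeddings = rng.normal(size=(5, 50)).astype(np.float32)
    # Hypothetical coordinates attached to the tweet.
    location = normalize_lat_lon(3.1, 101.7)
    fused = fuse_features(tweet_embeddings, location)
    print(fused.shape)  # (5, 52)
```

A fused matrix of this kind could then be passed to a sequence model such as a CNN or BiLSTM in place of the plain embedding matrix; concatenating per token is only one possible design, and fusing at the sentence level after pooling is an equally plausible alternative.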

Item Type: Thesis (Masters)
Additional Information: Call No.: QA76.9.S57 L56 2021
Uncontrolled Keywords: Sentiment analysis
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science > QA76.75-76.765 Computer software
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 16 Jan 2023 07:05
Last Modified: 17 Apr 2023 07:04
URI: http://shdl.mmu.edu.my/id/eprint/11101
