Term Standardisation With LDA Model To Detect Service Disruption Events Using English And Manglish Tweets

Citation

Yusuf, Noraysha and Ismail, Maizatul Akmar and Zayet, Tasnim M.A. and Varathan, Kasturi Dewi and MD Noor, Rafidah (2024) Term Standardisation With LDA Model To Detect Service Disruption Events Using English And Manglish Tweets. Journal of Informatics and Web Engineering, 3 (1). pp. 1-14. ISSN 2821-370X

622-Article Text-5241-3-10-20240216.pdf - Published Version
Restricted to Repository staff only

Abstract

Rapid transit is one of Malaysia's most important modes of public transport, and any disruption to the service affects commuters' daily routines. Detecting such service disruptions is therefore essential. This study assessed disruptions in Malaysia's rapid transit service from English and Manglish (a combination of English and Malay) tweets using Latent Dirichlet Allocation (LDA). The gathered tweets were classified into event and non-event tweets, and LDA was applied to the event tweets. Manglish event tweets were pre-processed using the proposed term standardisation technique. LDA proved effective at topic detection for both English and Manglish tweets, with better performance on Manglish tweets. The best event detection rate of the LDA_English model was at a likelihood of 80%, while the best detection rate of the LDA_Manglish model was at a likelihood of 60%.
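
The pipeline summarised in the abstract (standardise terms, then fit LDA on event tweets) can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup table STANDARD_TERMS, the function names, the hyperparameters, and the choice of a collapsed Gibbs sampler are all assumptions made for illustration, and the abstract does not describe how the proposed term standardisation technique works in detail.

```python
import random
import re

# Hypothetical lookup table: informal or Malay variants -> one canonical term.
# These entries are illustrative only and are not taken from the paper.
STANDARD_TERMS = {
    "tren": "train",
    "lambat": "delay",
    "delayed": "delay",
    "rosak": "breakdown",
}


def standardise(tweet):
    """Lowercase, tokenise, and map known variants to a canonical term."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [STANDARD_TERMS.get(tok, tok) for tok in tokens]


def lda_gibbs(docs, num_topics=2, iters=50, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA; returns per-topic word counts."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


# Made-up example tweets, not data from the study.
tweets = ["Tren LRT lambat lagi!", "LRT train delayed", "Escalator rosak at station"]
docs = [standardise(t) for t in tweets]
topics = lda_gibbs(docs)
```

With this table, "tren", "lambat" and "delayed" collapse into "train" and "delay", so English and Manglish tweets about the same disruption share vocabulary before the topic model sees them. A real study would more likely use an established LDA library and a much larger, curated term list.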

Item Type: Article
Uncontrolled Keywords: Rapid Transit, LDA, Manglish, Multilingual, Twitter
Subjects: P Language and Literature > P Philology. Linguistics
Divisions: Others
Depositing User: Mr. MUHAMMAD AZRUL MOSRI
Date Deposited: 01 Apr 2024 23:58
Last Modified: 01 Apr 2024 23:58
URI: http://shdl.mmu.edu.my/id/eprint/12239
