Punctuation Restoration with Transformer Model on Social Media Data

Citation

Bakare, Adebayo Mustapha and Sonai Muthu Anbananthen, Kalaiarasi and Muthaiyah, Saravanan and Krishnan, Jayakumar and Kannan, Subarmaniam (2023) Punctuation Restoration with Transformer Model on Social Media Data. Applied Sciences, 13 (3). p. 1685. ISSN 2076-3417

Text
16.pdf - Published Version
Restricted to Repository staff only

Abstract

Sentiment analysis faces several key challenges. One major problem is determining the sentiment of complex sentences, paragraphs, and text documents. A paragraph with multiple parts might carry multiple sentiment values, so predicting a single overall sentiment value for it will not yield all the information that businesses and brands need. A paragraph with multiple sentences should therefore be separated into simple sentences, from which all the possible sentiments can be extracted effectively. To split a paragraph, however, the paragraph must be properly punctuated. Most social media texts are improperly punctuated, so separating the sentences may be challenging. This study proposes a punctuation-restoration algorithm using the transformer model approach. We evaluated different Bidirectional Encoder Representations from Transformers (BERT) models for our transformer encoding, in addition to the neural network used for evaluation. Based on our evaluation, RoBERTa-large with a bidirectional long short-term memory (LSTM) network provided the best accuracy, 97% and 90%, for restoring punctuation on Amazon and Telekom data, respectively. Other evaluation criteria such as precision, recall, and F1-score are also used.
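The pipeline the abstract describes (restore punctuation, then split the paragraph into simple sentences) can be illustrated with a minimal sketch. It assumes punctuation restoration is framed as per-token labeling, which is a common formulation but is not confirmed by the abstract. The label set, the helper names `apply_labels` and `split_sentences`, and the example review are all hypothetical. The labels are written by hand here; in practice they would come from a trained model such as the RoBERTa and bidirectional LSTM tagger the study evaluates.

```python
import re

# Hypothetical label set; the abstract does not list the classes the model predicts.
PUNCT = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def apply_labels(tokens, labels):
    """Attach the punctuation mark named by each label to its token."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return " ".join(tok + PUNCT[lab] for tok, lab in zip(tokens, labels))


def split_sentences(text):
    """Split restored text after a sentence-ending mark followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


# Made-up unpunctuated review; the labels stand in for model predictions.
tokens = "the phone is great but the battery dies fast would not buy again".split()
labels = ["O"] * 8 + ["PERIOD"] + ["O"] * 3 + ["PERIOD"]

restored = apply_labels(tokens, labels)
sentences = split_sentences(restored)
```

Each element of `sentences` could then be passed to a sentiment classifier on its own, which is the motivation the abstract gives for restoring punctuation first.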

Item Type: Article
Uncontrolled Keywords: Punctuation restoration, transformers models, Bidirectional Encoder Representations from Transformers (BERT), long short-term memory (LSTM)
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800-8360 Electronics
Divisions: Faculty of Information Science and Technology (FIST); Faculty of Management (FOM)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 02 Mar 2023 06:39
Last Modified: 27 Apr 2023 13:15
URI: http://shdl.mmu.edu.my/id/eprint/11187
