Towards a Translation Framework for ‘rojak’ Language: Challenges and Early Findings

Citation

Tan, Jia Hui and Goh, Pey Yun and Tan, Shing Chiang and Chong, Lee Ying (2026) Towards a Translation Framework for ‘rojak’ Language: Challenges and Early Findings. In: 7th International Conference on AI in Computational Linguistics, ACLing 2025, 6 December 2025 - 7 December 2025, Hybrid, Dubai.

Text
23.pdf - Published Version
Restricted to Repository staff only

Abstract

The ‘rojak’ language, a complex linguistic blend of Malay, English, Mandarin, and local dialects prevalent in Malaysia and Singapore, presents significant translation challenges due to its informal expressions, frequent code-switching, and cultural idioms. Although LLMs have shown growing capability in multilingual translation, few works have evaluated their performance in the ‘rojak’ context. Furthermore, bilingual datasets are common, but ‘rojak’ datasets are not. This study aims to narrow this research gap by 1) proposing a ‘rojak’ dataset that captures informal slang, abbreviations, and emoticons, and 2) providing early findings on how prompt engineering and fine-tuning strategies affect LLMs, particularly the LLaMA model, in translation from the ‘rojak’ language to English. Evaluation was performed using several metrics, including BLEU, BERTScore, METEOR, TER, and COMET. The comparable performance of the fine-tuned LLaMA 3 8B highlights that parameter-efficient adaptation of open models can still yield competitive quality for low-resource, hybrid languages, demonstrating the potential of targeted fine-tuning in multilingual translation research.

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Large language model, code-switching, ‘rojak’ language
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 02 Jul 2026 01:42
Last Modified: 02 Jul 2026 01:42
URI: http://shdl.mmu.edu.my/id/eprint/16195
