Citation
Hossen, Md. Sabbir and Farid, Fahim Al and Shaha, Pabon and Twake, Md. Mowahibur Rahman and Sabah, Fahjimatus and Rezwan, K. M. Mursalin Billah and Rahman, Anichur and Karim, Hezerul Abdul and Miah, Abu Saleh Musa (2025) A sophisticated feature vectorization-based stacked machine learning approach for fake news detection in Bangla and English. Social Network Analysis and Mining, 16 (25). ISSN 1869-5469
Text
s13278-025-01552-6.pdf - Published Version (7MB). Restricted to Repository staff only.
Abstract
The rapid spread of fake news through social media and the internet poses a major challenge, especially in developing countries like Bangladesh. Fake news, consisting of misleading or fabricated content, can severely impact individuals, organizations, and society. Many existing detection methods are complex and resource-intensive, making them unsuitable for real-time applications or low-resource languages. To address this issue, we propose an efficient fake news detection system using a stacked machine learning model with TF-IDF for feature extraction. TF-IDF converts text into a numerical representation by emphasizing key terms, while the stacked ensemble model improves classification accuracy by integrating multiple machine learning algorithms. We also explored various machine learning classifiers, as well as the mBERT and Word2Vec feature vectorization techniques. We evaluated our system using three datasets: one English and two Bangla fake news detection datasets. The TF-IDF + Stacking model achieved 99.6% accuracy and a 99.8% F1-score on the English dataset. For the first Bangla dataset, accuracy improved from 74.8% to 85.2% after applying SMOTE. On the second Bangla dataset, BanFakeNews, the model achieved 98.4% accuracy, demonstrating strong performance across languages. This research introduces a lightweight yet highly effective machine learning-based fake news detection system, making it suitable for real-world applications, especially in multilingual and resource-limited settings.
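As a rough illustration of the TF-IDF step mentioned in the abstract, the sketch below weights each term by its frequency within a document and down-weights terms that appear in many documents. It is not the authors' implementation: the function name `tfidf`, the whitespace tokenizer, the smoothed IDF variant, and the sample headlines are all assumptions made for this example, and the stacked classifier itself is not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: relative term frequency times a smoothed IDF,
    log((1 + N) / (1 + df)) + 1. This is one common variant; the abstract
    does not state which variant the paper uses.
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


# Hypothetical headlines, invented for illustration only.
docs = [
    "government confirms new policy",
    "shocking secret cure confirms miracle",
    "government announces new budget",
]
vectors = tfidf(docs)
```

In this toy corpus, a term that occurs in only one headline (such as "policy") receives a higher weight than a term shared across headlines (such as "government"), which is the "emphasizing key terms" behavior the abstract describes. The resulting vectors would then be passed to the base classifiers of a stacked ensemble.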
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Machine learning, Fake news, Social media, Natural language processing, Feature extraction, Ensemble method, Deep learning |
| Subjects: | Q Science > QA Mathematics > QA71-90 Instruments and machines |
| Divisions: | Faculty of Artificial Intelligence & Engineering (FAIE) |
| Depositing User: | Ms Suzilawati Abu Samah |
| Date Deposited: | 09 Feb 2026 06:23 |
| Last Modified: | 09 Feb 2026 06:25 |
| URI: | http://shdl.mmu.edu.my/id/eprint/15239 |
