Citation
Hu, Wen Li and Zhang, Daniel (2024) Weighted Plain Bayes-Based English Text Classification in Multilingual Interactive Environments. Journal of Network Intelligence, 9 (4). pp. 1984-1999. ISSN 2414-8105
Text
04.JNI-S-2023-12-019.pdf (367kB) - Published Version. Restricted to Repository staff only.
Abstract
As globalization accelerates, communication between countries and nationalities is becoming increasingly frequent. Against this background, this paper proposes a weighted plain Bayesian-based English text classification algorithm for multilingual interactive environments (WTWNBA) to address the low classification efficiency and long classification time of current English text classification algorithms. First, an orthogonal transformation is used to eliminate the linear relationship between continuous attributes; the conditional probabilities of the weighted discrete attributes and of the orthogonally transformed continuous attributes are then computed separately, which improves the generalization ability of the WNBA algorithm. Then, because existing classification algorithms do not consider the influence of word location on the text, the proposed algorithm introduces interclass and intraclass discretization factors for the feature words and assigns different weights to English words at different locations. This strengthens its ability to distinguish the class distributions of feature words and yields accurate classification of English text. The experimental results show that the Accuracy, Precision, Recall and F1 values of ETWNBA exceed 90% on every dataset in the experiment, and that the classification time is 210 s when the number of texts is 100, indicating high classification efficiency and low classification time.
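The position-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is an ordinary multinomial naive ("plain") Bayes classifier with Laplace smoothing in which each word occurrence is scaled by a weight tied to where it appears. The weight values in `POSITION_WEIGHTS`, the `title`/`body` split, and all function names are hypothetical, and the interclass and intraclass factors and the orthogonal transformation described in the abstract are not modeled here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical position weights; the abstract does not state the actual values.
POSITION_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_counts(doc):
    """Sum position weights per word. `doc` maps a position name to a word list."""
    counts = Counter()
    for position, words in doc.items():
        for word in words:
            counts[word] += POSITION_WEIGHTS[position]
    return counts


def train(docs, labels):
    """Collect per-class document counts and position-weighted word counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        wc = weighted_counts(doc)
        word_counts[label].update(wc)
        vocab.update(wc)
    return class_docs, word_counts, vocab


def predict(doc, model):
    """Return the class with the highest weighted log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    wc = weighted_counts(doc)
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for word, weight in wc.items():
            # Laplace-smoothed likelihood, scaled by the word's position weight.
            p = (word_counts[label][word] + 1.0) / (total + len(vocab))
            score += weight * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    docs = [
        {"title": ["football", "match"], "body": ["team", "wins", "game"]},
        {"title": ["stock", "market"], "body": ["shares", "fall", "today"]},
    ]
    model = train(docs, ["sports", "finance"])
    print(predict({"title": ["football"], "body": ["today"]}, model))
```

In this sketch a title word counts three times as much as a body word, so a single strongly class-specific title word can outweigh a body word that points to another class.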
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Text categorization; Weighted plain Bayes; Orthogonal transformations; Discrete attributes; Multilingual interaction |
Subjects: | Q Science > QA Mathematics > QA299.6-433 Analysis |
Divisions: | Faculty of Information Science and Technology (FIST) |
Depositing User: | Ms Nurul Iqtiani Ahmad |
Date Deposited: | 03 Jan 2025 06:06 |
Last Modified: | 03 Jan 2025 06:09 |
URI: | http://shdl.mmu.edu.my/id/eprint/13312 |