Citation
Lokman, Amar and Wan Ismail, Wan Zakiah and Ab Aziz, Nor Azlina (2026) Enhancing water quality index prediction accuracy in Mranti lake and rivers in Malaysia using regression forest model. Applied Water Science, 16 (2). ISSN 2190-5487
Abstract
Water quality, as an economic and environmental commodity, has a substantial impact on public welfare and on the viability of a country's ecosystems. This research aims to improve the accuracy of water quality index (WQI) prediction using machine learning techniques based on the Malaysian standard. The study uses a historical Malaysian Department of Environment (DOE) dataset of 11,065 samples from different locations. The data include several important water quality parameters: pH, dissolved oxygen, total suspended solids, biological oxygen demand, chemical oxygen demand, and ammoniacal nitrogen concentration. We develop a novel model, the time series regression forest (TSRF), which is based on regression and random forest theory, to enhance WQI prediction. TSRF is compared with other machine learning models: decision trees (DT), ridge regression (RR), artificial neural networks (ANN), extra tree regressor (ETR), random forest (RF), autoregressive integrated moving average (ARIMA), DT-ARIMA, and RF-ARIMA. TSRF achieves the lowest mean square error (MSE) of 1.178 and the highest R² of 0.968, surpassing DT-ARIMA (MSE = 2.526, R² = 0.932) and RF-ARIMA (MSE = 2.907, R² = 0.922). In the comparison with deep learning approaches, long short-term memory (LSTM) achieves MSE = 9.649 and R² = 0.879, followed closely by convolutional neural network (CNN)-LSTM (MSE = 13.224, R² = 0.834). In contrast, gated recurrent unit (GRU) (R² = 0.781), Transformer (R² = 0.781), and Wavelet-ARIMA (R² = 0.676) exhibit higher errors and weaker correlations, highlighting the relative advantage of temporal deep learning models for dynamic water quality forecasting. These results indicate that TSRF can improve the reliability and efficiency of WQI prediction, enabling policymakers and environmental managers to make informed, data-driven decisions and implement targeted pollution control strategies.
Thus, the water quality assessment methods developed in this research can support better water quality management strategies. More broadly, advancing machine learning applications in environmental science strengthens current practices in water quality management and enhances decision-making capabilities.
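The abstract compares models by MSE and R² but does not state the formulas used. The following is a minimal sketch, assuming the conventional definitions: MSE as the mean of squared residuals, and R² as the coefficient of determination, 1 - SS_res / SS_tot. The function names and the sample WQI values are hypothetical illustrations, not data or code from the study.

```python
from typing import Sequence


def mean_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of squared residuals between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up WQI values for illustration only (not from the DOE dataset).
observed_wqi = [70.0, 80.0, 90.0]
predicted_wqi = [72.0, 79.0, 89.0]

mse = mean_squared_error(observed_wqi, predicted_wqi)
r2 = r_squared(observed_wqi, predicted_wqi)
print(f"MSE = {mse:.3f}, R^2 = {r2:.3f}")
```

Under these definitions, a lower MSE and an R² closer to 1 both indicate predictions that track the observed index more closely, which is the sense in which the abstract ranks TSRF above the other models.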
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Machine learning, environmental monitoring |
| Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics; T Technology > TD Environmental technology. Sanitary engineering |
| Divisions: | Faculty of Engineering and Technology (FET) |
| Depositing User: | Ms Rosnani Abd Wahab |
| Date Deposited: | 10 Feb 2026 03:30 |
| Last Modified: | 10 Feb 2026 03:30 |
| URI: | http://shdl.mmu.edu.my/id/eprint/15284 |
