Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction


Yafouz, Ayman and AlDahoul, Nouar and Birima, Ahmed H. and Ahmed, Ali Najah and Sherif, Mohsen and Sefelnasr, Ahmed and Allawi, Mohammed Falah and Elshafie, Ahmed (2022) Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction. Alexandria Engineering Journal, 61 (6). pp. 4607-4622. ISSN 1110-0168


Ozone (O3) is a common air pollutant. Elevated ozone concentrations can adversely affect public health and the environment, including vegetation and crops. Atmospheric air quality monitoring systems were therefore established to monitor and predict ozone concentration. Because ozone formation is complex and is influenced by ozone precursors and meteorological conditions, there is a need to examine and evaluate various machine learning (ML) models for ozone concentration prediction. This study uses several ML models, namely Linear Regression (LR), Tree Regression (TR), Support Vector Regression (SVR), Ensemble Regression (ER), Gaussian Process Regression (GPR), and Artificial Neural Network (ANN) models, to predict tropospheric ozone (O3) from an ozone concentration dataset. The dataset was compiled from hourly average observations recorded by air quality monitoring systems at three stations in Peninsular Malaysia: Putrajaya, Kelang, and KL. The prediction models were trained on this dataset and validated by optimizing their hyperparameters. Model performance was evaluated in terms of RMSE, MAE, R2, and training time. The results indicated that LR, SVR, GPR, and ANN gave the highest R2 (83 % and 89 %) with specific hyperparameters at the Kelang and KL stations, respectively. At the Putrajaya station, on the other hand, SVR and ER outperformed the other models in terms of R2 (79 %). Overall, despite slight performance differences, several of the developed models learned the patterns well and provided good prediction performance in terms of R2, RMSE, and MAE. Ensemble regression models were found to balance high prediction accuracy in terms of R2 against low training time, and were thus considered a feasible solution for ozone concentration prediction using hourly data.
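
The abstract names RMSE, MAE, and R2 as evaluation metrics without defining them. As a minimal sketch, assuming the standard textbook definitions (the record does not state the exact formulas the authors used), they could be computed as follows. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE and R2 for paired observed/predicted values.

    Assumes the standard definitions; the paper's exact formulas
    are not given in this record.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Residual sum of squares, shared by RMSE and R2.
    ss_res = sum(e * e for e in errors)

    # Root mean squared error: penalises large errors more heavily.
    rmse = math.sqrt(ss_res / n)

    # Mean absolute error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 minus the ratio of residual
    # variation to total variation around the observed mean.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Made-up hourly ozone values for illustration only (not study data).
    observed = [0.021, 0.034, 0.047, 0.052, 0.039]
    predicted = [0.024, 0.031, 0.045, 0.055, 0.036]
    print(regression_metrics(observed, predicted))
```

Under these definitions an R2 of 0.89 would mean the model accounts for 89 % of the variance in the observed concentrations, which is how the percentages in the abstract are presumably to be read.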

Item Type: Article
Uncontrolled Keywords: Air quality, Ozone concentration prediction, Machine learning
Subjects: T Technology > TD Environmental technology. Sanitary engineering > TD878-894 Special types of environment Including soil pollution, air pollution, noise pollution
Divisions: Faculty of Engineering (FOE)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 02 Dec 2021 14:04
Last Modified: 27 Jan 2022 15:00

