A Reinforcement Learning Guided Oppositional Mountain Gazelle Optimizer for Time–Cost–Risk Trade-Off Optimization Problems

Citation

Eirgash, Mohammad Azim and Tiang, Jun Jiat and Ateş, Bayram and Sharma, Abhishek and Lim, Wei Hong (2025) A Reinforcement Learning Guided Oppositional Mountain Gazelle Optimizer for Time–Cost–Risk Trade-Off Optimization Problems. Buildings, 16 (1). p. 144. ISSN 2075-5309

Text
buildings-16-00144-v2.pdf - Published Version
Restricted to Repository staff only

Abstract

Existing metaheuristic approaches often struggle to maintain an effective exploration–exploitation balance and are prone to premature convergence when addressing highly conflicting time–cost–safety–risk trade-off problems (TCSRTPs) under complex construction project constraints, which can adversely affect project productivity, safety, and the provision of decent jobs in the construction sector. To overcome these limitations, this study introduces a hybrid metaheuristic called the Q-Learning Inspired Mountain Gazelle Optimizer (QL-MGO) for solving multi-objective TCSRTPs in construction project management, supporting the delivery of resilient infrastructure and resilient building projects. QL-MGO enhances the original MGO by integrating Q-learning with an opposition-based learning strategy to improve the balance between exploration and exploitation while reducing computational effort and enhancing resource efficiency in construction scheduling. Each gazelle functions as an adaptive agent that learns effective search behaviors through a state–action–reward structure, thereby strengthening convergence stability and preserving solution diversity. A dynamic switching mechanism represents the core innovation of the proposed approach, enabling Q-learning to determine when opposition-based learning should be applied based on the performance history of the search process. The performance of QL-MGO is evaluated using 18- and 37-activity construction scheduling problems and compared with NDSII-MGO, NDSII-Jaya, NDSII-TLBO, the multi-objective genetic algorithm (MOGA), and NDSII-Rao-2. The results demonstrate that QL-MGO consistently generates superior Pareto fronts. For the 18-activity project, QL-MGO achieves the highest hypervolume (HV) value of 0.945 with a spread of 0.821, outperforming NDSII-Rao-2, MOGA, and NDSII-MGO. Similar results are observed for the 37-activity project, where QL-MGO attains the highest HV of 0.899 with a spread of 0.674, exceeding the performance of NDSII-Jaya, NDSII-TLBO, and NDSII-MGO. Overall, the integration of Q-learning significantly enhances the search capability of MGO, resulting in faster convergence, improved solution diversity, and more reliable multi-objective trade-off solutions. QL-MGO therefore serves as an effective and computationally efficient decision-support tool for construction scheduling that promotes safer, more reliable, and resource-efficient project delivery.
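
The switching idea described in the abstract, in which Q-learning decides when opposition-based learning is applied, can be illustrated with a minimal single-agent sketch. This is not the authors' QL-MGO implementation: the two states (improved or stalled on the previous step), the two actions (local perturbation or opposition jump), the 0/1 reward, the learning parameters, and the single-objective toy fitness function are all assumptions chosen for brevity. Only the opposite-point formula (lower bound + upper bound − x) and the tabular Q-learning update are standard definitions.

```python
import random


def opposite(x, lo, hi):
    """Opposition-based learning: reflect each coordinate within its bounds."""
    return [l + h - xi for xi, l, h in zip(x, lo, hi)]


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to q[s][a]."""
    best_next = max(q[s_next])
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy choice between the two (hypothetical) actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[s]))
    return max(range(len(q[s])), key=lambda a: q[s][a])


def toy_search(fitness, lo, hi, iters=200, seed=0):
    """Toy minimizer: one agent, two states, two actions (all assumed)."""
    rng = random.Random(seed)
    x = [rng.uniform(l, h) for l, h in zip(lo, hi)]
    fx = fitness(x)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    s = 0  # state 0: last step improved, state 1: last step stalled
    for _ in range(iters):
        a = choose_action(q, s, 0.2, rng)
        if a == 1:
            # Action 1: jump to the opposite point of the current solution.
            cand = opposite(x, lo, hi)
        else:
            # Action 0: small Gaussian perturbation, clipped to the bounds.
            cand = [min(h, max(l, xi + rng.gauss(0.0, 0.1 * (h - l))))
                    for xi, l, h in zip(x, lo, hi)]
        fc = fitness(cand)
        improved = fc < fx
        if improved:
            x, fx = cand, fc
        s_next = 0 if improved else 1
        q_update(q, s, a, 1.0 if improved else 0.0, s_next)
        s = s_next
    return x, fx, q


def shifted_sphere(x):
    """Hypothetical test function with its minimum at (1, 1, ...)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f, q_table = toy_search(shifted_sphere, [-5.0, -5.0], [5.0, 5.0])
```

In this sketch the Q-table row for the "stalled" state records how often an opposition jump paid off relative to a local move, which is one simple way to let the search history drive the switch. The multi-objective parts of the study (non-dominated sorting, hypervolume, spread) are deliberately left out.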

Item Type: Article
Uncontrolled Keywords: time–cost–risk trade-off problems, Mountain Gazelle Optimizer, reinforcement learning, opposition-based learning, Pareto-front solutions, construction optimization
Subjects: T Technology > TH Building construction > TH5011-5701 Construction by phase of the work (Building trades)
Divisions: Faculty of Artificial Intelligence & Engineering (FAIE)
Depositing User: Ms Suzilawati Abu Samah
Date Deposited: 09 Feb 2026 04:45
Last Modified: 09 Feb 2026 04:45
URI: http://shdl.mmu.edu.my/id/eprint/15236
