Purpose

Delays in water construction projects cause severe financial losses and societal setbacks. This study develops a stacking ensemble machine learning model to predict delay severity, helping project managers mitigate risks and support sustainable infrastructure development.

Design/methodology/approach

Based on a literature review and 439 real water project contracts, five features were selected: project duration, cost, climate zone, change costs, and adjustment costs. The data were preprocessed with scikit-learn (standardization and Elliptic Envelope outlier detection). Four base learners (ANN, Decision Tree, Random Forest, KNN) were tuned via grid search and combined in a stacking model with Random Forest as the meta-learner, which was validated through repeated stratified 5-fold cross-validation.
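The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins for the five contract features, and the hyperparameter grids, contamination rate, class thresholds, and number of repeats are all assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     cross_val_score)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the contract data (assumption): 5 numeric features
# and 4 hypothetical delay-severity classes. Real data would also need the
# categorical climate zone to be encoded first.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
signal = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)
y = np.digitize(signal, bins=[-0.7, 0.0, 0.7])

# Outlier removal with Elliptic Envelope; the 5% contamination is an assumption.
inlier = EllipticEnvelope(contamination=0.05, random_state=0).fit_predict(X) == 1
X_in, y_in = X[inlier], y[inlier]

# Base learners with deliberately tiny, illustrative grids.
# Scale-sensitive models get standardization inside a pipeline.
base_specs = {
    "ann": (make_pipeline(StandardScaler(),
                          MLPClassifier(max_iter=500, random_state=0)),
            {"mlpclassifier__hidden_layer_sizes": [(8,), (16,)]}),
    "dt": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 6]}),
    "rf": (RandomForestClassifier(n_estimators=50, random_state=0),
           {"max_depth": [4, None]}),
    "knn": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 7]}),
}

# Tune each base learner by grid search and keep its best configuration.
base_learners = [
    (name, GridSearchCV(est, grid, cv=3).fit(X_in, y_in).best_estimator_)
    for name, (est, grid) in base_specs.items()
]

# Stack the tuned learners with a Random Forest meta-learner.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    cv=5,
)

# Repeated stratified 5-fold cross-validation (2 repeats is an assumption).
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
scores = cross_val_score(stack, X_in, y_in, cv=cv, scoring="accuracy")
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

In this sketch the grid search runs on the full inlier set before cross-validation, which keeps the code short but makes the reported score somewhat optimistic; nesting the search inside each fold would avoid that at a higher computational cost.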

Findings

The stacking model achieves an accuracy of 0.957, an F1-score of 0.957, and a Kappa of 0.935, outperforming the individual algorithms by up to 5.5% and surpassing prior benchmarks. It also performs well in critical delay classes (4.4% error for the 30–60% delay class), supporting risk prediction and resource optimization.
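For readers unfamiliar with these metrics, the sketch below shows how accuracy, F1-score, Cohen's Kappa, and a per-class error rate can be computed with scikit-learn. The labels are a small hypothetical toy example, not the study's predictions; weighted averaging for F1 and the definition of per-class error as one minus recall are assumptions, since the abstract does not specify either.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, f1_score)

# Hypothetical true and predicted delay classes (4 classes, 4 samples each).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 3, 2, 3, 3, 3, 3]

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="weighted")  # averaging is an assumption
kappa = cohen_kappa_score(y_true, y_pred)

# Per-class error, taken here as the share of each true class
# that was misclassified (1 - recall).
cm = confusion_matrix(y_true, y_pred)
per_class_error = 1.0 - np.diag(cm) / cm.sum(axis=1)

print(f"accuracy={accuracy:.3f}  f1={f1:.3f}  kappa={kappa:.3f}")
print("per-class error:", np.round(per_class_error, 3))
```

Kappa corrects accuracy for the agreement expected by chance, which is why it is lower than accuracy when classes are balanced.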

Originality/value

This study applies stacking ensemble learning to delay forecasting in water projects for the first time, using real contract data to reduce bias and overfitting. It provides a framework for proactive planning, cost-efficient buffering, and resilient project delivery in construction management.

Licensed re-use rights only