The purpose of this study is to predict the compressive strength of hybrid-fiber-reinforced recycled aggregate concrete (HFRRAC) using machine learning (ML) models. The aim is to address the challenges of optimizing HFRRAC mix designs by modeling the complex and non-linear relationships between mixing ingredients and compressive strength.
A data set comprising 634 samples of HFRRAC mix proportions and corresponding compressive strength values was compiled from recent research studies. A total of 11 ML models were used for compressive strength prediction, and the models’ performance was evaluated using metrics such as R², mean squared error (MSE), mean absolute error (MAE) and root MSE (RMSE). Feature importance analysis was also performed to identify key predictors.
Light gradient boosting machine demonstrated the highest accuracy with an R² of 0.969, an MSE of 14.04, an MAE of 2.76 and an RMSE of 3.75, making it the most efficient model for predicting HFRRAC compressive strength. CatBoost was identified as the second-best performer. Feature importance analysis highlighted water-to-binder ratio, coarse aggregate-to-binder ratio, fine aggregate-to-binder ratio, recycled aggregate replacement rate and curing age as critical factors influencing compressive strength.
This study demonstrates the application of ML techniques to predict HFRRAC compressive strength with high accuracy, offering a reliable approach to mix design optimization.
1. Introduction
As economic development progresses and urbanization accelerates, the construction of new buildings is on the rise, leading to the generation of significant amounts of construction and demolition waste (CDW). This waste was commonly managed through simplistic and cost-effective methods such as open dumping and landfilling, resulting in severe environmental ramifications (Zari, 2024). The disposal of CDW poses difficulties in waste management, as landfills take up valuable land area and pose potential hazards of soil and groundwater pollution (Chen et al., 2021; Osra et al., 2024). Conversely, recycled aggregate (RA) produced from CDW provides a sustainable alternative. This approach decreases the need for new natural resources and minimizes environmental impact by reducing greenhouse gas emissions, lowering energy consumption associated with extracting virgin aggregates and preventing land contamination from CDW (Silva et al., 2019; Atasham ul haq et al., 2024). As an economical alternative to natural aggregates (NA), RA has the potential to replace traditional crushed stone and gravel partially.
One of the primary concerns of RA-based concrete is the substantial mechanical strength reduction, which requires a particular treatment procedure to increase its strength (Li et al., 2024; Wu et al., 2024; Xi et al., 2024). Various treatment methods have been suggested by researchers over the decades to improve the mechanical strength and durability of recycled aggregate concrete (RAC), such as chemical treatment, polymer emulsion, pozzolanic slurry, calcium carbonate bio-deposition, sodium silicate solution and the addition of different fibers in RAC (Meesala, 2019; Mistri et al., 2020; Wang et al., 2021; Rebai et al., 2024). However, most of the methods have some drawbacks. Chemical treatment such as presoaking in acid solution is not economically viable (Tam et al., 2007), polymer emulsion pollutes the environment (Spaeth and Djerbi Tegguer, 2013; Kaushik and Bhan, 2024), pozzolanic slurry can reduce the fluidity (Shi et al., 2018), calcium carbonate bio-deposition increases the number of small pores (Grabiec et al., 2012; Qiu et al., 2014; Shi et al., 2016) and sodium silicate solution cannot improve water penetration and may lead to alkali-silica reactions in concrete (Shi et al., 2016; Pan et al., 2017). On the other hand, adding different fibers with RA in recycled concrete can significantly improve the performance of RAC within environmental and cost constraints. Several research studies have shown that fiber reinforcement can improve the creep behavior, splitting tensile strength and flexural strength of concrete (Lee and Choi, 2013; Lyu et al., 2019). Additionally, under various loading circumstances, fiber can improve the concrete’s ductility and toughness.
Kang et al. (2017) found that the flexural strength of RAC beams with 100% RA and 0.15% steel fiber became comparable to natural aggregate concrete (NAC) beams. The findings of Carneiro et al. (2014) showed that the tensile strength of RAC increased by 26% for the incorporation of 0.75% of volumetric doses of steel fiber. Akça et al. (2015) also concluded that optimal performance could be achievable for RAC when incorporating 1% of polypropylene fiber. An experimental study by Ali et al. (2022) reported a 7% increment of compressive strength of RAC when RAC was produced with 0.1% doses of nylon fiber. Moreover, synthetic fibers such as polypropylene and nylon enhance concrete but raise ecological concerns, while natural alternatives such as alfa fibers offer sustainable improvements in tensile strength (up to 54.41%) with less environmental impact (Mamen et al., 2022). It is also considered that the combination of two or more fibers, or a hybrid fiber, can be more effective than a single fiber. He et al. (2020) found that the hybrid fibers consisting of steel and polypropylene showed better performance, significantly improving the compressive and flexural ductility of RAC in comparison to only steel fiber incorporated in RAC. An experimental study by Cui et al. (2023) showed that a multiscale hybrid fiber could alter the failure patterns and increase RAC’s compressive strength. In addition, the ability of fiber reinforcement in RAC is not only limited to enhancing the mechanical strength; it can also decrease the probability of concrete cracking by restraining shrinkage and expanding thermal stability (Asim et al., 2020; Nassif et al., 2022). Several studies suggested that the fiber reinforcement could also minimize concrete permeability and increase freeze-thaw resistance (Dong et al., 2022; Han et al., 2022).
Though the integration of fiber into concrete was initially intended to improve cracking behavior by increasing tensile and flexural strength (Abbas and Iqbal Khan, 2016; Pakravan et al., 2017), it has also been shown to enhance compressive strength (CS). Furthermore, the relationship of CS with the fiber dosages and RA replacement rate in fiber-reinforced recycled aggregate concrete (FRRAC) is nonlinear (Zhang et al., 2021; Zaid et al., 2022), which complicates CS prediction, especially for hybrid fibers. An efficient analytical method for predicting the CS of FRRAC is therefore crucial, since mechanical testing, although prevalent, is expensive, laborious, resource-intensive and time-consuming. In past decades, researchers have proposed several empirical methods for predicting CS. Still, these approaches were designed for typical concrete and fail to generalize to other types.
In contrast, data-driven strategies such as artificial intelligence (AI) or machine learning (ML) for predicting CS have gained much attention from researchers in recent years. Data-driven ML models can efficiently predict CS despite the heterogeneous nature of concrete and the nonlinear relationships among its mixing ingredients, outperforming traditional empirical methods (Chen et al., 2021; Pakzad et al., 2023; Pal et al., 2023; Alabduljabbar et al., 2024) with high generalization capabilities. Popular ML models frequently used for predicting CS include support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), adaptive boosting (AB), random forest (RF), extreme gradient boosting (XGB), categorical boosting (CB), light gradient boosting machine (LGBM), artificial neural network (ANN) and convolutional neural network (CNN). For instance, Ekanayake et al. (2022) employed tree-based, gradient boosting and Laplacian kernel regression-based ML algorithms to predict the CS of concrete and found XGB to be the most efficient model. However, researchers have overlooked CS prediction for hybrid FRRAC (HFRRAC). Existing data-driven ML models for CS prediction are limited to single-fiber concrete and focus only on simple input parameters. For instance, Pakzad et al. (2023) used a CNN to predict the CS of steel FRRAC, achieving a score of 0.92. Shafighfard et al. (2022) found the stacked ML model efficient for steel fiber-reinforced concrete. Additionally, Li et al. (2022) found that RF was a good fit for predicting the properties of basalt fiber-based concrete. Furthermore, Kang et al. (2021) reported that XGB had a low RMSE score of 3.61 for steel fiber-reinforced concrete. These studies developed models for single-fiber-based concrete only. Consequently, there is a notable research gap regarding the combined impact of different fibers on the CS of recycled aggregate concrete.
This study attempts to establish an efficient computational model for capturing the nonlinear relationship between CS and the mixing ingredients of HFRRAC. It uses ensemble models such as RF, AdaBoost, CatBoost, XGBoost and LGBM, along with tree-based models such as DT and simpler models such as Linear, Ridge and Lasso regression, SVM and KNN. By evaluating the statistical performance of these models, this study proposes the best-performing model and identifies the top contributing factors. Thus, the key objective of this study is to suggest the most efficient data-driven prediction model for the CS of HFRRAC by comparing the statistical performance of different ML algorithms.
1.1 Background of prediction models
The conventional destructive method for testing concrete’s CS is expensive, time-consuming and typically requires an extended curing period. In that approach, a concrete specimen, such as a cylinder or cube, is subjected to uniformly distributed axial compression until it fails. Conversely, researchers have long sought a faster and more cost-effective method to predict the CS of concrete. To estimate CS, various methods have been used, including computational, numerical and analytical models (Nithurshan and Elakneswaran, 2023).
Early attempts to predict CS sought to correlate it with mixing ingredients, such as the water-to-cement ratio. Feret (1892) established a quantitative relationship between the CS of concrete and its paste composition. His model emphasized the ratios of key components in concrete, particularly the volumetric percentage of cement, water and air. He proposed that the CS of concrete primarily depends on the volume fractions of cement paste and air. This approach aimed to capture the influence of the relative proportions of these components on the overall strength. However, this model showed poor agreement with experimental data, indicating that the theoretical relationship did not accurately predict the actual strength of concrete mixes. In addition, its proportionality coefficients had to be ascertained for every distinct trial mixture, which was laborious and time-consuming and made the model impractical (Popovics, 1985). Unlike Feret’s model, Abrams (1919) introduced a simpler and more consistent model based on the water-to-cement ratio. He suggested that the CS of concrete is inversely related to the water-to-cement ratio, meaning that when the ratio decreases, the concrete’s CS increases and vice versa. His model was widely accepted for its straightforward relationship between water-to-cement ratio and CS and exhibited significant compatibility with experimental results, particularly for concrete that is not air-entrained. However, the proportionality coefficients in the model, similar to those in Feret’s model, were not universal constants. Instead, they varied depending on factors such as the type of cement used, the concrete’s strength, the testing and curing conditions and the age of the concrete. Without adjustments, this variability restricts the model’s generalizability.
De Larrard (1999) later incorporated aggregate properties into concrete strength prediction. He built his model upon earlier models such as Feret’s and Abrams’ by addressing their limitations and adding new dimensions to concrete strength prediction. De Larrard integrated aggregate properties, such as maximum size and packing density, into a concrete strength model and developed coefficients for different aggregates. This model also included maximum paste thickness (MPT), the distance between aggregates that is filled by cement paste, to better understand paste distribution and component interaction within the concrete mix. However, a significant limitation of the model was that it failed to consider the degree of hydration or the chemical and physical characteristics of cement, both of which are crucial factors influencing concrete strength. Additionally, the validation of De Larrard’s model was based on a limited data set, which might reduce the reliability and generalizability of the predictions. Subsequently, Chidiac et al. (2013) proposed a model integrating the excess paste theory and average paste thickness (APT) along with multiple sub-models to provide a comprehensive approach to predicting concrete CS. It unified factors such as cement hydration, bond strength and aggregate gradation, offering realistic and practical strength predictions. However, the model’s validation was limited to a small sample size and a fixed range of water-cement ratios, potentially reducing its applicability to high-strength concrete and other conditions.
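For orientation, the two classical relations discussed above are commonly written in the following textbook forms. The symbols are introduced here only for illustration: $f_c$ is the compressive strength, $c$, $w$ and $a$ are the absolute volumes of cement, water and air, $w/c$ is the water-to-cement ratio, and $K$, $A$ and $B$ are the empirical coefficients that, as noted above, must be calibrated for each set of materials and test conditions.

$$f_c = K \left( \frac{c}{c + w + a} \right)^{2} \qquad \text{(Feret)}$$

$$f_c = \frac{A}{B^{\,w/c}} \qquad \text{(Abrams)}$$

In both forms, lowering the water content relative to the cement raises the predicted strength, which is the qualitative trend the text describes.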
Although traditional analytical models have laid the foundation for understanding concrete strength, their limitations necessitate the development of advanced prediction models. These advanced models, driven by data and ML, have the potential for greater accuracy, adaptability and efficiency in predicting and optimizing concrete performance. In recent years, ML-based prediction models have gained wide popularity in the scientific community (Meddage et al., 2022; Meddage et al., 2024). Researchers have been implementing advanced ML models such as SVM, KNN, DT, RF, AdaBoost, CatBoost, XGBoost, LightGBM and neural networks for predicting concrete CS because of their high prediction accuracy and generalization capabilities. Unlike traditional analytical models, data-driven ML models can predict CS from large-scale data sets with complex mix designs for special types of concrete such as fiber-reinforced, recycled aggregate and self-consolidating concrete (Nithurshan and Elakneswaran, 2023). Several researchers have developed data-driven ML models to predict the CS of different kinds of concrete over the years. For instance, Yu et al. (2018) used SVM for high-performance concrete (HPC) with 1,761 data points. They found that SVM can better predict the CS of HPC with advanced mix constituents such as blast furnace slag, fly ash and superplasticizer. Chopra et al. (2018) used RF, DT and neural network (NN) models for CS prediction of plain concrete in different aging conditions. They concluded that NN was the most feasible model, having higher accuracy. Feng et al. (2020) took quantitative amounts of aggregate, cement, water and additives into account as input parameters and found that the AB approach was superior to NN and SVM.
Unlike plain concrete, recycled aggregate concrete (RAC), fiber-based concrete and other special-purpose concretes exhibit greater heterogeneity and require more complex mix designs (Nithurshan and Elakneswaran, 2023). These complex mix designs and heterogeneity could be the reason for multicollinearity (two or more input variables being highly correlated) among input parameters, leading to more complexities in prediction. Recently, however, a few researchers have overcome these complexities and developed several prediction models accounting for RAC and fiber-based concrete. Deng et al. (2018) used deep learning to develop a SoftMax regression model using recycled aggregate replacement ratio, water-to-cement ratio and fly ash content as input parameters. They concluded that their model demonstrated high accuracy and strong generalization capacity. Meddage et al. (2024) reported high accuracy for an XGB model, with a score of 0.981, for concrete with graphene oxide. Sabău and Remolina Duran (2022) used multiple linear regression and found a good correlation for predicting the CS of recycled aggregate concrete with different aging conditions. Dabiri et al. (2022a) developed linear, nonlinear and RF regression models and compared them with an ANN model, finding RF to be the most accurate. Based on their findings, they deduced that incorporating recycled aggregate decreases the concrete’s CS.
However, it has been demonstrated that incorporating fiber into plain concrete as well as RAC can improve crack patterns, flexural toughness and CS. A few researchers have also developed models for fiber-based concrete. For instance, Chen et al. (2021) found the CNN to be an accurate and flexible model for predicting the CS of fiber-reinforced concrete at elevated temperatures and proposed it as an appropriate mix design optimization tool. Huang et al. (2021) applied a CNN to predict the CS of basalt FRRAC, and the model exhibited a maximum 3% residual error distribution. Pal et al. (2023) applied different ML models to fiber-reinforced concrete containing waste rubber and recycled aggregate and found CatBoost to be the most efficient model.
In sum, most of the existing studies developed efficient models for predicting the CS of advanced types of concrete such as RAC, fiber-reinforced concrete and FRRAC. However, most of these models were developed using a single type of fiber and lacked essential input parameters, such as the water absorption of recycled aggregate. Therefore, this study comprehensively evaluated several predictive models for HFRRAC, incorporating critical input parameters such as recycled aggregate and fiber dosage, fiber types, water absorption of recycled aggregate, water-to-binder ratio and aggregate proportions to identify the most efficient model for optimizing mix design.
2. Materials and method
2.1 Dataset development
A total of 634 concrete CS test results were retrieved from 37 peer-reviewed experimental studies (Gao et al., 1997; Nataraja et al., 1999; Thomas and Ramaswamy, 2007; Bhikshma and Manipal, 2012; Chen et al., 2014; Akça et al., 2015; Zakaria et al., 2015; Mohammad et al., 2016; Senaratne et al., 2016; Afroughsabet et al., 2017; Abbass et al., 2018; Das et al., 2018; Hanumesh et al., 2018; Islam and Ahmed, 2018; Lima et al., 2018, 2019; Aslani et al., 2019; Lee, 2019; Matar and Assaad, 2019; Matar and Zéhil, 2019; Nitesh et al., 2019; Ramesh et al., 2019; Ahmed et al., 2020; He et al., 2020; Zhang et al., 2020; Zhou et al., 2020; Ahmad et al., 2021; Bheel et al., 2021; Gao and Wang, 2021; Gao et al., 2021; Kachouh et al., 2021; Ren et al., 2021; Ali et al., 2022; El Ouni et al., 2022; Zhang et al., 2022a, 2022b; Vijayan et al., 2022; Gong et al., 2023). The CS data involve different concrete mixes with important characteristics such as water to binder ratio (w/b), curing days, recycled aggregate replacement rate (RA), percentage of fiber, fiber type, coarse aggregate to binder ratio (CA/b), fine aggregate to binder ratio (FA/b) and water absorption rate of recycled aggregate as input and compressive strength (CS) as the desired output feature. The data set comprises a total of five distinct types of fiber, namely, steel, polypropylene, jute, nylon and sisal. Additionally, two types of composite fibers were also included; they were steel and polypropylene and nylon and jute. While there were some supplementary features in the data set development phase, only highly influential features with sufficient experimental data were taken into account. Owing to insufficient matching features and inadequate information, several test results were discarded. For repeated tests, the average value of CS was taken into account. The data retrieval procedure faced notable obstacles, particularly in dealing with missing data. 
To address these challenges, the missing data were imputed using a combination of the Gaussian mixture model (GMM) and KNN imputation methods. This method first used GMM imputation to estimate missing values, which were then refined by KNN imputation.
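The two-stage idea described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the procedure used in the study: the function name `gmm_knn_impute`, the diagonal-covariance GMM, the number of components and neighbors, and the synthetic data are all hypothetical choices. Missing entries are first filled with a responsibility-weighted GMM mean and then replaced by the average of the nearest neighbors in the filled data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors


def gmm_knn_impute(X, n_components=2, n_neighbors=5, seed=0):
    """Fill NaNs with a GMM estimate, then refine them with a KNN average.

    Hypothetical helper: one possible reading of a GMM + KNN imputation.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete_rows = ~missing.any(axis=1)

    # Stage 1: fit a diagonal-covariance GMM on the fully observed rows.
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=seed).fit(X[complete_rows])
    means, variances, weights = gmm.means_, gmm.covariances_, gmm.weights_

    filled = X.copy()
    incomplete = np.where(~complete_rows)[0]
    for i in incomplete:
        obs = ~missing[i]
        # Component responsibilities computed from the observed features only.
        log_p = np.log(weights) - 0.5 * np.sum(
            np.log(2 * np.pi * variances[:, obs])
            + (X[i, obs] - means[:, obs]) ** 2 / variances[:, obs], axis=1)
        resp = np.exp(log_p - log_p.max())
        resp /= resp.sum()
        # With diagonal covariances the conditional mean is the component mean.
        filled[i, ~obs] = resp @ means[:, ~obs]

    # Stage 2: replace each imputed entry by the mean of its nearest neighbors.
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(filled)
    _, idx = nn.kneighbors(filled)
    refined = filled.copy()
    for i in incomplete:
        neighbours = idx[i, 1:]  # drop the first neighbor, normally the row itself
        refined[i, missing[i]] = filled[neighbours][:, missing[i]].mean(axis=0)
    return refined


# Synthetic demonstration data (not the HFRRAC data set).
rng = np.random.default_rng(0)
X_true = rng.normal(size=(200, 3))
X_missing = X_true.copy()
X_missing[rng.random(X_true.shape) < 0.1] = np.nan
X_imputed = gmm_knn_impute(X_missing)
```

In this sketch, observed values are never altered; only the originally missing cells are estimated. In practice the features would usually be brought to a common scale before the neighbor search, since the distances are otherwise dominated by large-valued features.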
2.1.1 Statistical analysis of the data set
A statistical summary of the parameters in the data set is provided in Table 1. The water-to-binder ratio (w/b) varies from 0.2 to 0.66, with a mean value of 0.42 and a standard deviation of 0.1. Notably, 75% of the observations for w/b are 0.5 or lower. The curing age of the concrete ranges from a minimum of 7 days to a maximum of 365 days. However, 75% of the test observations used a curing period of 28 days, resulting in an average curing age of 32.62 days. Table 1 also indicates that the average recycled aggregate replacement rate is 38.84%. The replacement rate ranges from 0%, indicating the exclusive use of NA, to 100%, indicating a complete substitution of NA with RA. The standard deviation for the RA replacement rate is 43.64.
Table 1. Statistical description of features in HFRRAC mix
| Parameters | W/b | Curing | RA (%) | Fiber (%) | CA/b | FA/b | WA (%) | CS (MPa) |
|---|---|---|---|---|---|---|---|---|
| Count | 634 | 634 | 634 | 634 | 634 | 634 | 634 | 634 |
| Mean | 0.42 | 32.62 | 38.84 | 0.66 | 2.16 | 1.65 | 2.05 | 44.42 |
| SD | 0.1 | 35.54 | 43.64 | 0.76 | 0.68 | 0.69 | 2.79 | 22.05 |
| Min | 0.2 | 7 | 0 | 0 | 0 | 0.29 | 0 | 11 |
| Max | 0.66 | 365 | 100 | 6 | 4.18 | 4 | 10.29 | 132 |
| 25th percentile | 0.35 | 28 | 0 | 0 | 1.7 | 1.3 | 0 | 27 |
| 50th percentile | 0.42 | 28 | 0 | 0.5 | 2.22 | 1.62 | 0 | 38.75 |
| 75th percentile | 0.5 | 28 | 100 | 1 | 2.6 | 1.76 | 4.83 | 59.75 |
Similarly, at least 25% of the test observations lack any form of fiber reinforcement. On average, each observation has a fiber content of 0.66%, with a standard deviation of 0.76. The mean fine aggregate-to-binder and coarse aggregate-to-binder ratios are 1.65 and 2.16, respectively, with standard deviations of 0.69 and 0.68. For the water absorption rate of recycled aggregate, 75% of the values are at or below 4.83%, with a standard deviation of 2.79. The designated output, CS, ranges from a minimum of 11 MPa to a maximum of 132 MPa. The mean CS value is 44.42 MPa, with 75% of the values falling at or below 59.75 MPa.
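A summary of this kind (count, mean, standard deviation, minimum, maximum and quartiles) can be produced with pandas, as the minimal sketch below suggests. The data frame here is synthetic and the column names are placeholders chosen to mirror Table 1; it does not reproduce the study’s data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the mix data; values are random, not the real data set.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "w/b": rng.uniform(0.2, 0.66, size=634),
    "Curing": rng.choice([7, 28, 90, 365], size=634),
    "CS (MPa)": rng.uniform(11, 132, size=634),
})

# describe() returns count, mean, std, min, quartiles and max per column.
summary = df.describe().rename(index={
    "count": "Count", "mean": "Mean", "std": "SD", "min": "Min",
    "25%": "25th percentile", "50%": "50th percentile",
    "75%": "75th percentile", "max": "Max",
})
print(summary.round(2))
```

Note that pandas reports the sample standard deviation (dividing by n − 1) by default.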
The distribution of CS (MPa) with the input features has been depicted in Figure 1. The output feature, CS (MPa), can be seen as a function of the input features and the range of CS (MPa) variation for each input feature value is also presented.
Figure 1. Input feature distribution of targeted output CS (MPa). Eight scatter plots of CS (MPa) against w/b, curing time, RA (%), fiber (%), fiber type, CA/b, FA/b and WA (%), grouped by fiber category (no fiber, steel, polypropylene, jute and other fibers).
Source(s): Authors’ own work
The distribution of fibers present in the HFRRAC mixes has been depicted in Figure 2. The data set shows that 28% do not include any fiber. Steel fiber stands out as the most prevalent among the various types of fibers, accounting for 27% of the total observations. The sisal fiber exhibits the lowest count, with 26 test observations among the single fibers. Additionally, two distinct categories of hybrid fibers exist (simultaneously mixing two single fibers), namely, 1% steel and polypropylene and 1% nylon and jute, which are the least prevalent.
Figure 2. Distribution of input fiber types in HFRRAC mix data set. Pie chart with segments for no fiber, jute, steel, nylon, nylon and jute, steel and polypropylene, polypropylene and sisal.
Source(s): Authors’ own work
2.1.2 Correlation among the features
Understanding the correlation between input and output features is crucial when proposing a prediction model, as it shows how variables interact and influence each other. Besides influencing the output feature, some input features may be interdependent. Such interdependency among input variables with a high correlation coefficient may lead to multicollinearity, which can cause inefficiency and instability in prediction models.
The correlation coefficient (R) is a statistical metric that quantifies how much one variable changes in relation to another. Equation (1) defines the Pearson correlation coefficient as the covariance between two variables, $\operatorname{cov}(X, Y)$, divided by the product of their standard deviations, $\sigma_X \sigma_Y$. R is bounded by $-1 \le R \le 1$, where a value of R close to $-1$ or $+1$ indicates a stronger linear relationship between the two variables:

$$R = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y} \tag{1}$$
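The definition of the Pearson coefficient (covariance divided by the product of the standard deviations) can be checked numerically with a few lines of NumPy. The sketch below uses synthetic data; the variable names `wb` and `cs` and the linear trend used to generate them are assumptions made purely for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))  # population covariance
    return cov_xy / (x.std() * y.std())                # population std (ddof=0)


# Synthetic example: strength that falls as the water-to-binder ratio rises.
rng = np.random.default_rng(0)
wb = rng.uniform(0.2, 0.66, size=200)
cs = 100 - 120 * wb + rng.normal(0, 5, size=200)

r = pearson_r(wb, cs)
print(f"Pearson R = {r:.3f}")
```

The result agrees with `np.corrcoef(wb, cs)[0, 1]`, and the same pairwise computation over all columns yields a correlation matrix of the kind shown in Table 2.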
Table 2 presents the Pearson correlation coefficients determined from equation (1) among the input and output features. The results indicate that the water-to-binder ratio (R = −0.814) has the strongest influence on CS, while curing age (R = 0.166) shows the largest positive correlation: CS increases with curing age and decreases as the w/b ratio increases. Similar dependencies of CS on these features have also been reported in earlier studies (Poon et al., 2004; Kisku et al., 2017; Pourbaba et al., 2018; Zheng et al., 2018; Sultana et al., 2020; Kang, Yoo and Gupta, 2021; Dabiri et al., 2022b; Zhang et al., 2022a, 2022b; Pakzad et al., 2023).
Table 2. Pearson correlation coefficient of the input and output features (w/b to WA (%) are input features; CS (MPa) is the output feature)
| Features | w/b | Curing | RA (%) | Fiber (%) | CA/b | FA/b | WA (%) | CS (MPa) |
|---|---|---|---|---|---|---|---|---|
| w/b | 1 | 0.035 | −0.054 | −0.139 | 0.524 | 0.428 | −0.034 | −0.814 |
| Curing | 0.035 | 1 | 0.02 | −0.015 | 0.005 | 0.016 | −0.037 | 0.166 |
| RA (%) | −0.054 | 0.02 | 1 | −0.085 | −0.065 | −0.093 | 0.642 | −0.06 |
| Fiber (%) | −0.139 | −0.015 | −0.085 | 1 | −0.192 | −0.164 | −0.111 | 0.156 |
| CA/b | 0.524 | 0.005 | −0.065 | −0.192 | 1 | 0.192 | 0.006 | −0.323 |
| FA/b | 0.428 | 0.016 | −0.093 | −0.164 | 0.192 | 1 | −0.116 | −0.329 |
| WA (%) | −0.034 | −0.037 | 0.642 | −0.111 | 0.006 | −0.116 | 1 | −0.006 |
| CS (MPa) | −0.814 | 0.166 | −0.06 | 0.156 | −0.323 | −0.329 | −0.006 | 1 |
2.2 Preprocessing of data
Data preprocessing transforms unclean data into a format that is suitable for training an ML model. Data from diverse sources often remains in its raw form, which makes it difficult to feed into ML models. Unclean and raw data may contain features that are measured in various units, have skewed or non-normal distributions and are in an unstructured format, such as categorical values. Preprocessing therefore plays a crucial role in transforming raw data into a structured format that ML models can learn from. Converting categorical values into numerical values and standardizing data to a similar range has been found to enhance a model’s overall performance.
2.2.1 Categorical encoding
Categorical data comprises non-numerical values, such as textual data. ML algorithms usually perform better on numerical data; hence, categorical data must be converted into numerical format before it is fed into the models, a step commonly referred to as categorical encoding. In this study, only one feature, fiber type, is categorical; it contains 5 distinct fiber categories across 634 observations. The feature was encoded with the one-hot encoding method using the Scikit-learn module in Python. A binary column was created for each category in the original categorical variable. Each column denotes a distinct category, where a value of 1 denotes that the data point is associated with that category, whereas a value of 0 denotes otherwise. In the case of steel fiber, for example, a value of 1 denotes the fiber’s presence and a value of 0 denotes its absence.
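A minimal sketch of this encoding step with Scikit-learn’s `OneHotEncoder` is shown below. The five sample rows and the category labels are illustrative placeholders, not entries from the study’s data set.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Illustrative fiber-type column: one categorical value per observation.
fiber_type = np.array([["steel"], ["jute"], ["none"], ["steel"], ["nylon"]])

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; convert it for display.
encoded = encoder.fit_transform(fiber_type).toarray()
categories = list(encoder.categories_[0])

print(categories)  # column order follows the sorted category labels
print(encoded)     # one binary column per category, a single 1 per row
```

Each row contains exactly one 1, located in the column of the fiber type associated with that observation.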
2.2.2 Data set standardization
Table 1 shows that the features have varying numerical scales, which can bias the training process because features with larger scales may dominate learning. This dominating effect is mitigated through data standardization, which brings all features to a similar scale. Z-score normalization rescales each feature to a mean of 0 and a standard deviation of 1. Equation (2) presents the Z-score normalization technique, where the scaled value ($x_{\text{scaled}}$) is the difference between the original feature value ($x$) and the mean ($\mu$) of the feature values, divided by the standard deviation ($\sigma$) of the feature values in the data set:
$$x_{\text{scaled}} = \frac{x - \mu}{\sigma} \quad (2)$$
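The Z-score normalization described above can be sketched in a few lines of standard-library Python. This is an illustration only: the helper name `z_score` and the sample values are hypothetical, and the use of the population standard deviation is an assumption (it matches the default behavior of Scikit-learn's StandardScaler, but the study does not state which form was used).

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)        # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("constant feature cannot be standardized")
    return [(x - mu) / sigma for x in values]


# Hypothetical water-to-binder ratios, for illustration only.
w_b = [0.30, 0.40, 0.50, 0.60]
scaled = z_score(w_b)
print([round(v, 3) for v in scaled])
```

In practice the mean and standard deviation would be estimated on the training split only and then reused to transform the test split, so that no information from the test set leaks into training.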
2.3 Model validation and performance metrics
A total of 11 different ML models were implemented in this study: Linear Regression, Ridge Regression, Lasso Regression, SVM, K-Nearest Neighbor, DT, AdaBoost, RF, CatBoost, XGBoost and LGBM. Model validation is a technique that ensures models perform well on unseen data, avoiding overfitting. It comprises two main steps: data set splitting and cross-validation. The data set was split into a training set (80%) and a test set (20%). The training set is used to train the model, and the test set is used to evaluate its predictions against the true values.
Performance metrics are quantitative measures used to evaluate the performance of a model on a given data set, in terms of both predictive accuracy and generalization ability. Several performance metrics were used in this study, namely adjusted R², mean squared error (MSE), mean absolute error (MAE) and root MSE (RMSE), which were also used in various studies on concrete strength prediction (Yuan et al., 2014; Getahun et al., 2018; Yu et al., 2018; Kaloop et al., 2020; Salami et al., 2021; Ahmad et al., 2022; Liu, 2022).
R² is a statistical metric that shows how much of the variance in the dependent variable can be accounted for by the independent variables. A higher value, on a scale from 0 to 1, denotes a better model fit to the data. A modified form of R², known as adjusted R², penalizes the inclusion of redundant predictors by adjusting for both the number of predictors ($p$) and the number of observations ($n$) in the model. As a result, it is a more reliable metric when contrasting models with different numbers of predictors. Only adjusted R² is reported in the remaining sections. MSE calculates the average of the squared differences between predicted ($\hat{y}_i$) and actual ($y_i$) values, providing a measure of prediction accuracy. It is sensitive to outliers because the errors are squared. RMSE, the square root of MSE, measures the residual standard deviation. In contrast, MAE provides a more intuitive measure of prediction error by averaging the absolute differences between the predicted and actual values. The mathematical equations of these performance metrics are presented in equations (3)–(7), where $\bar{y}$ is the mean of the actual values:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \quad (3)$$
$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1} \quad (4)$$
$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \quad (5)$$
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \quad (6)$$
$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \quad (7)$$
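To make the definitions above concrete, the following sketch computes the five metrics in their standard textbook forms with plain Python. The function name `regression_metrics` and the sample strength values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def regression_metrics(y_true, y_pred, n_predictors):
    """Return R2, adjusted R2, MSE, RMSE and MAE for paired true/predicted values."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)                     # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)            # penalizes extra predictors
    mse = ss_res / n
    mae = sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n
    return {"r2": r2, "adj_r2": adj_r2, "mse": mse, "rmse": sqrt(mse), "mae": mae}


# Hypothetical compressive strengths (MPa), for illustration only.
y_true = [30, 40, 50, 60, 70]
y_pred = [32, 38, 51, 59, 72]
metrics = regression_metrics(y_true, y_pred, n_predictors=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because adjusted R² divides by n − p − 1, it is only defined when the number of observations exceeds the number of predictors by at least two.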
2.4 Hyperparameter tuning
The best set of hyperparameters for each model was found using grid search (GridSearchCV). Grid search involves specifying a grid of hyperparameter values to explore; for each combination in the grid, 5-fold cross-validation was performed. With this method, the training data were split into 5 equal folds. The models were trained on 4 of the 5 folds and validated on the remaining fold, with the process rotated through all folds and repeated for every combination in the hyperparameter grid. By aggregating the cross-validation results, an optimal combination of hyperparameters was identified, and the final models were then trained with those hyperparameters. All the hyperparameters used in each model are presented in Table 3.
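The mechanics of grid search with 5-fold cross-validation can be illustrated without any ML library. The sketch below is not the study's implementation (which used GridSearchCV): it tunes the `alpha` of a one-feature ridge regression on synthetic data, and all function names, the data and the grid values are assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=42):
    """Shuffle the sample indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_ridge_1d(x, y, alpha):
    """Closed-form ridge fit for one feature: penalized slope, unpenalized intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    return slope, y_bar - slope * x_bar


def cv_mse(x, y, alpha, folds):
    """Average validation MSE over all folds for one hyperparameter value."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], alpha)
        scores.append(mean((y[i] - (slope * x[i] + intercept)) ** 2 for i in fold))
    return mean(scores)


def grid_search(x, y, alphas, k=5):
    """Evaluate every grid value with k-fold CV and return the best one."""
    folds = k_fold_indices(len(x), k)
    results = {alpha: cv_mse(x, y, alpha, folds) for alpha in alphas}
    return min(results, key=results.get), results


# Synthetic training data (assumption): y = 3x + 5 plus Gaussian noise.
rng = random.Random(0)
x_train = [i / 10 for i in range(50)]
y_train = [3.0 * xi + 5.0 + rng.gauss(0, 0.5) for xi in x_train]

best_alpha, cv_results = grid_search(x_train, y_train, alphas=[0.1, 1.0, 10.0, 100.0])
final_slope, final_intercept = fit_ridge_1d(x_train, y_train, best_alpha)  # refit on all training data
print(best_alpha, cv_results)
```

The same folds are reused for every grid value so that the candidates are compared on identical splits; the held-out test set plays no part in the search.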
Table 3. Implemented algorithms and their hyperparameter values
| No. | Models | Parameters | Values | Standard range |
|---|---|---|---|---|
| 1 | RR | alpha | 10.0 | 0.1–100 (log scale) |
| | | fit_intercept | True | Boolean (true/false) |
| | | solver | sparse_cg | {“auto”, “svd”, “cholesky”, “lsqr”, “sparse_cg”, “sag”, “saga”} |
| 2 | Lasso | alpha | 0.1 | 0.0001–1 (log scale) |
| | | fit_intercept | True | Boolean (true/false) |
| | | selection | random | {“cyclic”, “random”} |
| 3 | SVM | C | 100 | 0.1–1000 (log scale) |
| | | gamma | scale | {“scale”, “auto”} or float (0.001–10) |
| | | kernel | rbf | {“linear”, “poly”, “rbf”, “sigmoid”} |
| 4 | KNN | algorithm | brute | {“auto”, “ball_tree”, “kd_tree”, “brute”} |
| | | leaf_size | 10 | 1–100 |
| | | metric | manhattan | {“euclidean”, “manhattan”, “minkowski”} |
| | | n_neighbors | 3 | 1–20 |
| | | p | 1 | 1 (manhattan), 2 (euclidean) |
| | | weights | distance | {“uniform”, “distance”} |
| 5 | DT | criterion | squared_error | {“squared_error”, “friedman_mse”, “absolute_error”} |
| | | max_depth | 20 | 1–100 |
| | | max_features | sqrt | {“sqrt”, “log2”, None} or int/float |
| | | max_leaf_nodes | None | 2–infinity or None |
| | | min_samples_leaf | 1 | 1–20 |
| | | min_samples_split | 2 | 2–20 |
| | | random_state | 42 | Fixed seed |
| | | splitter | best | {“best”, “random”} |
| 6 | RF | max_depth | 30 | 1–100 |
| | | max_features | sqrt | {“sqrt”, “log2”, None} or int/float |
| | | min_samples_leaf | 1 | 1–20 |
| | | min_samples_split | 2 | 2–20 |
| | | n_estimators | 200 | 10–1,000 |
| 7 | AB | learning_rate | 1.0 | 0.01–1 |
| | | n_estimators | 50 | 10–1,000 |
| | | random_state | 72 | Fixed seed |
| 8 | XGB | colsample_bytree | 0.4 | 0.1–1 |
| | | gamma | 0.1 | 0–infinity (typically 0–5) |
| | | learning_rate | 0.15 | 0.001–0.3 |
| | | max_depth | 6 | 1–20 |
| | | min_child_weight | 1 | 0–infinity (typically 1–10) |
| 9 | CB | learning_rate | 0.3 | 0.001–0.3 |
| | | n_estimators | 100 | 10–1,000 |
| | | random_state | 50 | Fixed seed |
| 10 | LGBM | learning_rate | 0.1 | 0.001–0.3 |
| | | max_depth | 5 | 1–20 |
| | | min_data_in_leaf | 20 | 1–100 |
| | | n_estimators | 450 | 10–1,000 |
3. Results and discussion
3.1 Model performance
This section presents the performance metrics of the 11 implemented models for CS prediction of HFRRAC. Table 4 lists the performance metrics of the implemented models for both the train and test phases. Tree-based ensemble models such as LGBM, CB, XGB and RF outperform all other models. LGBM performed best in both the train and test phases, with adjusted R² scores of 0.988 and 0.969, respectively. The small difference between its train and test scores also indicates limited overfitting. LGBM efficiently captured the nonlinear nature of the HFRRAC data, as it uses a learning rate to control the contribution of each tree, balancing the bias-variance tradeoff and improving generalization, which results in low MSE, MAE and RMSE scores of 14.04, 2.76 and 3.75 in the test phase. It is worth mentioning that other ensemble models, such as CB, XGB and RF, can also handle complex and nonlinear feature interactions, with high adjusted R² scores of 0.967, 0.966 and 0.966 in the test phase.
Table 4. Performance metrics of each model in both train and test phases
| Model | Phase | Adj. R² | MSE | MAE | RMSE |
|---|---|---|---|---|---|
| LR | Train | 0.753 | 114.88 | 8.1 | 10.74 |
| | Test | 0.72 | 126.19 | 8.55 | 11.23 |
| RR | Train | 0.754 | 114.84 | 8.11 | 10.72 |
| | Test | 0.721 | 125.91 | 8.53 | 11.22 |
| Lasso | Train | 0.755 | 114.87 | 8.12 | 10.73 |
| | Test | 0.725 | 123.97 | 8.54 | 11.13 |
| SVM | Train | 0.948 | 33.5 | 3.17 | 5.79 |
| | Test | 0.936 | 28.77 | 3.61 | 5.36 |
| KNN | Train | 0.973 | 12.74 | 1.28 | 3.57 |
| | Test | 0.936 | 29.05 | 3.86 | 5.39 |
| DT | Train | 0.962 | 31.73 | 3.46 | 5.63 |
| | Test | 0.958 | 18.87 | 3.14 | 4.34 |
| RF | Train | 0.98 | 9.09 | 1.83 | 3.01 |
| | Test | 0.966 | 15.5 | 2.86 | 3.94 |
| AB | Train | 0.865 | 63.11 | 6.65 | 7.94 |
| | Test | 0.871 | 58.08 | 6.26 | 7.62 |
| XGB | Train | 0.982 | 8.42 | 1.63 | 2.9 |
| | Test | 0.966 | 15.46 | 2.71 | 3.93 |
| CB | Train | 0.986 | 6.37 | 1.35 | 2.52 |
| | Test | 0.967 | 14.87 | 2.64 | 3.86 |
| LGBM | Train | 0.988 | 5.78 | 1.7 | 2.4 |
| | Test | 0.969 | 14.04 | 2.76 | 3.75 |
In contrast, the simplest models, LR, RR and Lasso, perform poorly in both the train (adjusted R² = 0.753, 0.754 and 0.755) and test (adjusted R² = 0.72, 0.721 and 0.725) phases. These models cannot capture the complex, nonlinear patterns in the HFRRAC data. They inherently assume a simple linear relationship between the input and output features and are unable to handle the high dimensionality of complex data. Several studies have also reported the inefficiency of these linear models, with low R² and high MSE scores (Khademi et al., 2016; Kang et al., 2021; Patil et al., 2023; Bansal et al., 2024).
3.1.1 Rank-wise analysis
The implemented models were further compared using a method called ‘score analysis’, as employed by Pal et al. (2023). In this approach, each model is scored on a scale of 1 to N (where N = 11, the total number of implemented models) for each performance metric, separately in the train and test phases. The best-performing model receives a score of 11, while the poorest receives a score of 1. The total score for each phase is obtained by summing the scores across the performance metrics, and the final score of a model is the sum of its train and test total scores. The model with the highest final score is assigned the final rank of 1, while the model with the lowest final score is assigned 11. Table 5 shows that LGBM achieved the highest total scores in both the train and test phases (41 and 42), securing the top position with a final score of 83. CB achieved close scores of 40 and 41 in the train and test phases, making it the second-best performer with a final score of 81, followed by XGB and RF in third and fourth place, respectively. LR has the lowest rank of 11, making it the poorest-performing model. This ranking can support mix design, quality control and structural safety assessment by identifying efficient models for reliable CS prediction in HFRRAC while reducing dependence on physical testing.
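The scoring step can be sketched as follows, using the test-phase metrics reported in Table 4. The tie rule (models with equal metric values share the lower score) is an assumption inferred from the tied scores in Table 5, not a rule stated by the authors, and the function name `rank_scores` is illustrative.

```python
def rank_scores(values, higher_is_better):
    """Score each model as 1 + the number of strictly worse models; ties share a score."""
    scores = {}
    for model, value in values.items():
        if higher_is_better:
            worse = sum(1 for other in values.values() if other < value)
        else:
            worse = sum(1 for other in values.values() if other > value)
        scores[model] = worse + 1
    return scores


# Test-phase metrics from Table 4: (adjusted R2, MSE, MAE, RMSE).
test_metrics = {
    "LR": (0.720, 126.19, 8.55, 11.23),
    "RR": (0.721, 125.91, 8.53, 11.22),
    "Lasso": (0.725, 123.97, 8.54, 11.13),
    "SVM": (0.936, 28.77, 3.61, 5.36),
    "KNN": (0.936, 29.05, 3.86, 5.39),
    "DT": (0.958, 18.87, 3.14, 4.34),
    "RF": (0.966, 15.50, 2.86, 3.94),
    "AB": (0.871, 58.08, 6.26, 7.62),
    "XGB": (0.966, 15.46, 2.71, 3.93),
    "CB": (0.967, 14.87, 2.64, 3.86),
    "LGBM": (0.969, 14.04, 2.76, 3.75),
}

metric_names = ["adj_r2", "mse", "mae", "rmse"]
total_score = {model: 0 for model in test_metrics}
for col, name in enumerate(metric_names):
    column = {model: row[col] for model, row in test_metrics.items()}
    # Only adjusted R2 is "higher is better"; the three error metrics are "lower is better".
    for model, score in rank_scores(column, higher_is_better=(name == "adj_r2")).items():
        total_score[model] += score

for model, score in sorted(total_score.items(), key=lambda item: -item[1]):
    print(model, score)
```

Repeating the same procedure on the train-phase metrics and adding the two totals gives the final score on which the final rank is based.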
Table 5. Rank-wise score analysis
| Models | Phase | Adj. R² | MSE | MAE | RMSE | Total score | Final score | Final rank |
|---|---|---|---|---|---|---|---|---|
| LR | Train | 1 | 1 | 3 | 1 | 6 | 10 | 11 |
| | Test | 1 | 1 | 1 | 1 | 4 | | |
| RR | Train | 2 | 3 | 2 | 3 | 10 | 19 | 9 |
| | Test | 2 | 2 | 3 | 2 | 9 | | |
| Lasso | Train | 3 | 2 | 1 | 2 | 8 | 19 | 9 |
| | Test | 3 | 3 | 2 | 3 | 11 | | |
| SVM | Train | 5 | 5 | 6 | 5 | 21 | 44 | 7 |
| | Test | 5 | 6 | 6 | 6 | 23 | | |
| KNN | Train | 7 | 7 | 11 | 7 | 32 | 52 | 5 |
| | Test | 5 | 5 | 5 | 5 | 20 | | |
| DT | Train | 6 | 6 | 5 | 6 | 23 | 51 | 6 |
| | Test | 7 | 7 | 7 | 7 | 28 | | |
| RF | Train | 8 | 8 | 7 | 8 | 31 | 63 | 4 |
| | Test | 8 | 8 | 8 | 8 | 32 | | |
| AB | Train | 4 | 4 | 4 | 4 | 16 | 32 | 8 |
| | Test | 4 | 4 | 4 | 4 | 16 | | |
| XGB | Train | 9 | 9 | 9 | 9 | 36 | 72 | 3 |
| | Test | 8 | 9 | 10 | 9 | 36 | | |
| CB | Train | 10 | 10 | 10 | 10 | 40 | 81 | 2 |
| | Test | 10 | 10 | 11 | 10 | 41 | | |
| LGBM | Train | 11 | 11 | 8 | 11 | 41 | 83 | 1 |
| | Test | 11 | 11 | 9 | 11 | 42 | | |
3.2 Rate of residual error
The difference between the actual and predicted CS is referred to as the residual error. Minimizing residual error in CS prediction is critical for enhancing safety margins, particularly for recycled concrete, where variability due to material source and quality is common. This section considers residual errors in the test phase only. The distribution of residual error for each model, in terms of interquartile range (IQR), is illustrated in Figure 3. XGB has the lowest IQR, followed by CB, LGBM and RF. On the other hand, LR, RR and Lasso have the highest IQR, indicating that they are the least effective models. Although LGBM has a slightly higher IQR than XGB and CB, it has the fewest outliers, which indicates more reliable performance. Figure 4 shows the kernel density estimate of the residual error for each model. A model with lower skewness and a higher peak closer to zero exhibits a higher degree of normality. Although CB and XGB have higher peaks closer to zero, LGBM exhibits lower skewness, indicating a more nearly normal residual distribution. Hence, adopting such models can enhance the precision of performance-based concrete design in sustainable construction.
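A short sketch shows how the IQR and the outliers of a residual distribution can be obtained. The 1.5 × IQR fence used to flag outliers is the common box-plot convention and is assumed here; the study does not state which rule its plots use, and the strength values below are hypothetical.

```python
from statistics import quantiles


def residual_summary(y_true, y_pred):
    """Return the residuals, their IQR and the points outside the 1.5*IQR fences."""
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    q1, _, q3 = quantiles(residuals, n=4, method="inclusive")   # quartiles
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr               # box-plot fences (assumption)
    outliers = [r for r in residuals if r < lower or r > upper]
    return {"residuals": residuals, "iqr": iqr, "outliers": outliers}


# Hypothetical actual and predicted strengths (MPa); the last sample is badly underpredicted.
y_true = [40, 45, 50, 55, 60, 65, 70, 75, 80]
y_pred = [42, 46, 50, 54, 58, 64, 71, 75, 65]

summary = residual_summary(y_true, y_pred)
print("IQR:", summary["iqr"])
print("Outliers:", summary["outliers"])
```

A model can therefore have a narrow IQR and still produce a few large errors, which is why the IQR and the number of outliers are considered together.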
Figure 3. Distribution of residual error for each model (box plots; horizontal axis: models, vertical axis: residual error; boxes show the interquartile range, horizontal lines the medians and individual points the outliers)
Source(s): Authors’ own work
Figure 4. Kernel density estimation of residual error (one density curve per model; horizontal axis: residual error from −40 to 40, vertical axis: density up to approximately 0.14)
Source(s): Authors’ own work
3.3 Sensitivity analysis
Sensitivity analysis quantifies the absolute importance of each input feature influencing the CS. Figure 5 depicts the feature importance of the input variables that affect the CS of HFRRAC for SVR, KNN, DT, RF, AB, XGB, CB and LGBM. Absolute importance is given in the range of 0%–100%. The variation in feature importance across models is attributed to differences in their internal architecture and learning techniques. Despite these differences, w/b is clearly the most influential feature for the CS of HFRRAC in most of the models. This finding aligns with fundamental concrete mix design principles, according to which an increase in w/b significantly decreases the CS of concrete (Aïtcin, 2016; Kaplan et al., 2022; Kaur et al., 2023). FA/b, CA/b, curing age, RA (%) and fiber (%) also have significant importance across the models. Steel fiber demonstrated the highest influence among the fibers in our models. This result is consistent with its theoretical advantages, such as superior tensile strength and stiffness for crack bridging (Kang et al., 2017), and with the synergistic effects in hybrid fiber systems, where steel fibers dominate macro-crack resistance while other fibers (e.g., polypropylene) control micro-cracking (He et al., 2020).
Figure 5. Feature importance of input variables influencing CS for each model (bar charts of percentage importance per model for water-to-binder ratio, fine aggregate-to-binder ratio, coarse aggregate-to-binder ratio, curing time, recycled aggregate percentage, fiber percentage and steel fiber)
Source(s): Authors’ own work
3.4 Model comparison and statistical scores
Figure 6 compares the implemented models through plots of true versus predicted CS of HFRRAC, whereas Figure 7 presents the statistical scores of the models. The slope of the line fitted to the predicted versus experimental values is denoted as “m”; it is the change in the predicted CS for a one-unit increase in the true CS. A value of m closer to 1 indicates a more precise model. An m value less than 1 indicates underprediction, and an m value greater than 1 indicates overprediction. CB achieved the m value closest to 1 (0.962), closely followed by LGBM (0.961).
Figure 6. Scatter plot of actual vs predicted CS for each model (one panel per model; horizontal axis: experimental values, vertical axis: predicted values; a diagonal reference line marks perfect prediction and a trend line is fitted to the data points)
Source(s): Authors’ own work
Figure 7. Radar chart for statistical score of each model: (a) slope, (b) correlation coefficient, (c) standard error (panels (a) and (b) range from 0 to 1; panel (c) ranges from 0 to 0.4)
Source(s): Authors’ own work
The correlation coefficient, R, indicates the strength of the linear relationship between the predicted and true values, that is, how well the predictions match the true values. A high positive R value close to +1 indicates a strong relationship between the predicted and true values. Conversely, a low R value approaching zero reveals a minimal or nonexistent association between them. LGBM had the highest R value of 0.986, closely followed by CB and XGB, both with R values of 0.985.
The standard error, in turn, quantifies the dispersion of the predicted values around the regression line as the standard deviation of the residuals. The standard error of the regression thus estimates the typical deviation between the predicted values and the true values. A smaller standard error signifies that the predicted values lie closer to the true values, indicating a better fit of the regression model to the data. A higher standard error implies greater dispersion of the predicted values around the regression line, indicating a less satisfactory fit. Figure 7 illustrates that LGBM, followed by XGB and CB, has the lowest standard error, indicating reduced variability and better model fit.
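The three statistical scores discussed in this section can be computed from paired true and predicted values as sketched below. This is an illustration under assumptions: the standard error is taken as the residual standard deviation about the fitted line with n − 2 degrees of freedom, which may differ from the exact definition used for Figure 7, and the sample values are hypothetical.

```python
from math import sqrt


def fit_statistics(y_true, y_pred):
    """Slope m, correlation R and standard error of the predicted-vs-true regression line."""
    n = len(y_true)
    x_bar = sum(y_true) / n
    y_bar = sum(y_pred) / n
    sxx = sum((x - x_bar) ** 2 for x in y_true)
    syy = sum((y - y_bar) ** 2 for y in y_pred)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(y_true, y_pred))
    slope = sxy / sxx                         # change in predicted CS per unit of true CS
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)                 # Pearson correlation coefficient
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(y_true, y_pred))
    std_error = sqrt(ss_res / (n - 2))        # residual standard deviation (assumed definition)
    return slope, r, std_error


# Hypothetical compressive strengths (MPa), for illustration only.
y_true = [30, 40, 50, 60, 70]
y_pred = [32, 38, 51, 59, 72]
m, r, se = fit_statistics(y_true, y_pred)
print(f"m = {m:.3f}, R = {r:.3f}, standard error = {se:.3f}")
```

A slope near 1 together with a high R and a small standard error indicates that the predictions track the true values closely across the whole strength range.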
4. Conclusion
This study analyzed a data set of 634 samples compiled from various published experimental studies, using 11 ML models to predict the CS of HFRRAC. The hyperparameters were tuned by the grid search technique, and the models were optimized with 5-fold cross-validation to counter overfitting. The key findings are as follows:
Among the eleven implemented ML models, LGBM emerged as the top performer, achieving an adjusted R² of 0.969, an MSE of 14.04, an MAE of 2.76 and an RMSE of 3.75. CatBoost followed closely with an adjusted R² of 0.967, an MSE of 14.87, an MAE of 2.64 and an RMSE of 3.86. XGBoost was the closest competitor to LGBM and CB.
The least effective models are Linear, Ridge and Lasso Regression, each with an adjusted R² below 0.80. These models also exhibit high error metrics, with MSE exceeding 110, MAE over 8 and RMSE above 10. Their simplicity prevents them from capturing the complex, non-linear relationships within the data, making them unsuitable for predicting the CS of HFRRAC.
Among the tree-based ensemble models, AdaBoost is the worst performer, ranking 8th among the 11 ML models. It relies on weak learners such as decision stumps, which may struggle with complex feature interactions and risk overfitting. In contrast, the other ensemble models capture these interactions more effectively with deeper trees.
According to the sensitivity analysis, w/b is the most influential parameter for CS: CS decreases as w/b increases. CA/b, FA/b, RA (%) and curing age were also found to be important parameters for CS.
Steel fiber was found to be the most influential of all the fibers. Higher dosages of some fibers, such as polypropylene and nylon, led to a reduction in CS. Using hybrid fibers overcame this issue and improved the CS of HFRRAC.
This study builds an alternative, data-driven method of predicting the CS of HFRRAC using different ML models. The strong correlation between input and output features and the high prediction accuracy can assist the civil engineering community in selecting the optimal mix design and input attributes for subsequent modeling. However, this study lacks some key parameters, such as the inclination angle and orientation profile of the fibers, which account for isotropic effects (influence on mechanical strength) in HFRRAC. Future research could focus on including these parameters in the prediction model.

