Data-driven prediction of concrete strength by machine learning: hybrid-fiber-reinforced recycled aggregate concrete

Hasan, Md Rabiul; Shuvo, Aojoy Kumar; Pranto, Ehsanul Bashar; Hasan, Mehedi; Miah, Md Mintu

doi:10.1108/WJE-01-2025-0038

Purpose

The purpose of this study is to predict the compressive strength of hybrid-fiber-reinforced recycled aggregate concrete (HFRRAC) using machine learning (ML) models. The aim is to address the challenges of optimizing HFRRAC mix designs by modeling the complex and non-linear relationships between mixing ingredients and compressive strength.

Design/methodology/approach

A data set comprising 634 samples of HFRRAC mix proportions and corresponding compressive strength values was compiled from recent research studies. A total of 11 ML models were used for compressive strength prediction, and the models’ performance was evaluated using metrics such as R², mean squared error (MSE), mean absolute error (MAE) and root MSE (RMSE). Feature importance analysis was also performed to identify key predictors.

Findings

Light gradient boosting machine demonstrated the highest accuracy with an R² of 0.969, an MSE of 14.04, an MAE of 2.76 and an RMSE of 3.75, making it the most efficient model for predicting HFRRAC compressive strength. CatBoost was identified as the second-best performer. Feature importance analysis highlighted water-to-binder ratio, coarse aggregate-to-binder ratio, fine aggregate-to-binder ratio, recycled aggregate replacement rate and curing age as critical factors influencing compressive strength.

Originality/value

This study demonstrates the application of ML techniques to predict HFRRAC compressive strength with high accuracy, offering a reliable approach to mix design optimization.

1. Introduction

As economic development progresses and urbanization accelerates, the construction of new buildings is on the rise, leading to the generation of significant amounts of construction and demolition waste (CDW). This waste was commonly managed through simplistic and cost-effective methods such as open dumping and landfilling, resulting in severe environmental ramifications (Zari, 2024). The disposal of CDW poses difficulties in waste management, as landfills take up valuable land area and pose potential hazards of soil and groundwater pollution (Chen et al., 2021; Osra et al., 2024). Conversely, recycled aggregate (RA) produced from CDW provides a sustainable alternative. This approach decreases the need for new natural resources and minimizes environmental impact by reducing greenhouse gas emissions, lowering energy consumption associated with extracting virgin aggregates and preventing land contamination from CDW (Silva et al., 2019; Atasham ul haq et al., 2024). As an economical alternative to natural aggregates (NA), RA has the potential to replace traditional crushed stone and gravel partially.

One of the primary concerns of RA-based concrete is the substantial mechanical strength reduction, which requires a particular treatment procedure to increase its strength (Li et al., 2024; Wu et al., 2024; Xi et al., 2024). Over the decades, researchers have suggested various treatment methods to address the mechanical strength and durability concerns of recycled aggregate concrete (RAC), such as chemical treatment, polymer emulsion, pozzolanic slurry, calcium carbonate bio-deposition, sodium silicate solution and the addition of different fibers in RAC (Meesala, 2019; Mistri et al., 2020; Wang et al., 2021; Rebai et al., 2024). However, most of these methods have drawbacks. Chemical treatment such as presoaking in acid solution is not economically viable (Tam et al., 2007), polymer emulsion pollutes the environment (Spaeth and Djerbi Tegguer, 2013; Kaushik and Bhan, 2024), pozzolanic slurry can reduce the fluidity (Shi et al., 2018), calcium carbonate bio-deposition increases the number of small pores (Grabiec et al., 2012; Qiu et al., 2014; Shi et al., 2016) and sodium silicate solution cannot improve water penetration and may lead to alkali-silica reactions in concrete (Shi et al., 2016; Pan et al., 2017). On the other hand, adding different fibers with RA in recycled concrete can significantly improve the performance of RAC while meeting environmental and cost constraints. Several research studies have shown that fiber reinforcement can improve the creep behavior, splitting tensile strength and flexural strength of concrete (Lee and Choi, 2013; Lyu et al., 2019). Additionally, under various loading circumstances, fiber can improve the concrete’s ductility and toughness.

Kang et al. (2017) found that the flexural strength of RAC beams with 100% RA and 0.15% steel fiber became comparable to natural aggregate concrete (NAC) beams. The findings of Carneiro et al. (2014) showed that the tensile strength of RAC increased by 26% for the incorporation of 0.75% of volumetric doses of steel fiber. Akça et al. (2015) also concluded that optimal performance could be achievable for RAC when incorporating 1% of polypropylene fiber. An experimental study by Ali et al. (2022) reported a 7% increment of compressive strength of RAC when RAC was produced with 0.1% doses of nylon fiber. Moreover, synthetic fibers such as polypropylene and nylon enhance concrete but raise ecological concerns, while natural alternatives such as alfa fibers offer sustainable improvements in tensile strength (up to 54.41%) with less environmental impact (Mamen et al., 2022). It is also considered that the combination of two or more fibers, or a hybrid fiber, can be more effective than a single fiber. He et al. (2020) found that the hybrid fibers consisting of steel and polypropylene showed better performance, significantly improving the compressive and flexural ductility of RAC in comparison to only steel fiber incorporated in RAC. An experimental study by Cui et al. (2023) showed that a multiscale hybrid fiber could alter the failure patterns and increase RAC’s compressive strength. In addition, the ability of fiber reinforcement in RAC is not only limited to enhancing the mechanical strength; it can also decrease the probability of concrete cracking by restraining shrinkage and expanding thermal stability (Asim et al., 2020; Nassif et al., 2022). Several studies suggested that the fiber reinforcement could also minimize concrete permeability and increase freeze-thaw resistance (Dong et al., 2022; Han et al., 2022).

Though the integration of fiber into concrete was initially intended to improve cracking behavior by increasing tensile and flexural strength (Abbas and Iqbal Khan, 2016; Pakravan et al., 2017), it has also been shown to enhance compressive strength (CS). Furthermore, the relationship of CS with the fiber dosages and RA replacement rate in fiber-reinforced recycled aggregate concrete (FRRAC) is nonlinear (Zhang et al., 2021; Zaid et al., 2022), which complicates the prediction of CS, especially for hybrid fibers. It is therefore crucial to establish an efficient analytical method for predicting the CS of FRRAC, since mechanical testing is prevalent but expensive, laborious, resource-intensive and time-consuming. In past decades, researchers proposed several empirical methods for predicting CS. Still, these approaches were designed for typical concrete and fail to generalize to other types.

In contrast, data-driven strategies such as artificial intelligence (AI) or machine learning (ML) for predicting CS have gained much attention from researchers in recent years. Data-driven ML models can efficiently predict CS despite the heterogeneous nature of concrete and the nonlinear relationships among its mixing ingredients, outperforming traditional empirical methods (Chen et al., 2021; Pakzad et al., 2023; Pal et al., 2023; Alabduljabbar et al., 2024) with high generalization capabilities. Popular ML models frequently used for predicting CS include support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), adaptive boosting (AB), random forest (RF), extreme gradient boosting (XGB), categorical boosting (CB), light gradient boosting machine (LGBM), artificial neural network (ANN) and convolutional neural network (CNN). For instance, Ekanayake et al. (2022) employed tree-based, gradient boosting and Laplacian kernel regression-based ML algorithms to predict the CS of concrete and found XGB to be the most efficient model. However, researchers have overlooked CS prediction for hybrid FRRAC (HFRRAC). Existing data-driven ML models of CS prediction for fiber-reinforced concrete are limited to single-fiber mixes and consider only simple input parameters. For instance, Pakzad et al. (2023) used a CNN to predict the CS of steel FRRAC, achieving an $R^{2}$ score of 0.92. Shafighfard et al. (2022) found the stacked ML model efficient for steel fiber-reinforced concrete. Additionally, Li et al. (2022) found that RF was a good fit for predicting the properties of basalt fiber-based concrete. Furthermore, Kang et al. (2021) reported that XGB had a low RMSE score of 3.61 for steel fiber-reinforced concrete. These studies developed models for single-fiber-based concrete only. Consequently, there is a notable research gap regarding the combined impact of different fibers on the CS of recycled aggregate concrete.

This study has attempted to establish an efficient computational model for capturing the nonlinear relationship between CS and the mixing ingredients of HFRRAC. This study used ensemble models such as RF, AdaBoost, CatBoost, XGBoost and LGBM, along with tree-based models such as DT and simpler models such as Linear, Ridge and Lasso regression, SVM and KNN. By evaluating the statistical performance of these models, this study proposed the best-performing model and the top contributing factor. Thus, the key objective of this study is to suggest the most efficient data-driven prediction model for the CS of HFRRAC by comparing the statistical performance of different ML algorithms.

1.1 Background of prediction models

The conventional destructive method for testing concrete’s CS is expensive, time-consuming and typically requires an extended curing period. In that approach, a concrete specimen, such as a cylinder or cube, is subjected to uniformly distributed axial compression until it fails. Consequently, researchers have long sought a faster and more cost-effective method to predict the CS of concrete. To estimate CS, various methods have been used, including computational, numerical and analytical models (Nithurshan and Elakneswaran, 2023).

Early attempts to predict CS were made to build a correlation between mixing ingredients, such as water-to-cement ratio and CS. Feret (1892) established a quantitative relationship between the CS of concrete and its paste composition. His model emphasized the ratios of key components in concrete, particularly the volumetric percentage of cement, water and air. He proposed that the CS of concrete primarily depends on the volume fractions of cement paste and air. This approach aimed to capture the influence of the relative proportions of these components on the overall strength. However, this model showed poor agreement with experimental data, indicating that the theoretical relationship did not accurately predict the actual strength of concrete mixes and the proportionality coefficients of this model had to be ascertained for every distinct trial mixture, which was laborious and time-consuming and made the model impractical (Popovics, 1985). Unlike Feret’s model, Abrams (1919) introduced a simpler and more consistent model with a water-to-cement ratio. He suggested that the water-to-cement ratio and the CS of concrete are inversely proportional, meaning that when the ratio decreases, the concrete’s CS increases and vice versa. His model was widely accepted for its straightforward water-to-cement ratio and CS relationship and exhibited significant compatibility with experimental results, particularly for concrete that is not air-entrained. However, the proportionality coefficients in the model, similar to those in Feret’s model, were not universal constants. Instead, they varied depending on factors such as the type of cement used, the concrete’s strength, the testing and curing conditions and the age of the concrete. Without adjustments, this variability restricts the model’s generalizability. However, De Larrard (1999) incorporated aggregates’ properties into the concrete strength prediction. 
He built his model upon earlier models such as Feret’s and Abrams’ by addressing their limitations and adding new dimensions to concrete strength prediction. De Larrard integrated aggregate properties, such as maximum size and packing density, into a concrete strength model and developed coefficients for different aggregates. This model also included maximum paste thickness (MPT), the distance between aggregates filled by cement paste, to better understand paste distribution and component interaction within the concrete mix. However, a significant limitation of the model was that it failed to consider the degree of hydration or the chemical and physical characteristics of cement, both of which are crucial factors influencing concrete strength. Additionally, the validation of De Larrard’s model was based on a limited data set, which might reduce the reliability and generalizability of the predictions. Later, Chidiac et al. (2013) proposed a model integrating the excess paste theory and average paste thickness (APT) along with multiple sub-models to provide a comprehensive approach to predicting concrete CS. It unified factors such as cement hydration, bond strength and aggregate gradation, offering realistic and practical strength predictions. However, the model’s validation was limited to a small sample size and a constant range of water-cement ratios, potentially reducing its applicability to high-strength concrete and other conditions.
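For reference, the two classical relationships discussed above are commonly quoted in the following forms. The notation is an assumption introduced here for illustration and is not taken from the cited works: $f_{c}$ is the CS, $c$, $w$ and $a$ are the absolute volumes of cement, water and air, $w/c$ is the water-to-cement ratio, and $K$, $A$ and $B$ are empirical coefficients.

```latex
% Feret (1892): strength governed by the volume fraction of cement in the paste
f_{c} = K \left( \frac{c}{c + w + a} \right)^{2}

% Abrams (1919): strength falls as the water-to-cement ratio rises
f_{c} = \frac{A}{B^{\,w/c}}
```

Both forms make the limitation noted above explicit: $K$, $A$ and $B$ are not universal constants, so they must be recalibrated whenever the materials, curing conditions or age change.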

Although traditional analytical models have laid the foundation for understanding concrete strength, their limitations necessitate the development of advanced prediction models. These advanced models, driven by data and ML, have the potential for greater accuracy, adaptability and efficiency in predicting and optimizing concrete performance. In recent years, ML-based prediction models have gained wide popularity among the scientific community (Meddage et al., 2022; Meddage et al., 2024). Researchers have been implementing advanced ML models such as SVM, KNN, DT, RF, AdaBoost, CatBoost, XGBoost, LightGBM and Neural Networks for predicting concrete CS because of their high prediction accuracy and generalization capabilities. Unlike traditional analytical models, data-driven ML-based models can predict CS from large-scale data sets with complex mix designs for special types of concrete such as fiber-reinforced, recycled aggregate and self-consolidating concrete (Nithurshan and Elakneswaran, 2023). Several researchers have developed data-driven ML models to predict the CS of different kinds of concrete over the years. For instance, Yu et al. (2018) used SVM for high-performance concrete (HPC) with 1,761 data points. They found that SVM can better predict the CS of HPC with advanced mix constituents such as blast furnace slag, fly ash and superplasticizer. Chopra et al. (2018) used RF, DT and neural network (NN) models for CS prediction of plain concrete in different aging conditions. They concluded that NN was the most feasible model, having higher accuracy. Feng et al. (2020) took quantitative amounts of aggregate, cement, water and additives into account as input parameters and found that the AB approach was superior to NN and SVM.
Unlike plain concrete, recycled aggregate concrete (RAC), fiber-based concrete and other special-purpose concretes have been shown to be more heterogeneous and to require complex mix designs (Nithurshan and Elakneswaran, 2023). These complex mix designs and heterogeneity could be the reason for multicollinearity (two or more input variables being highly correlated) among input parameters, leading to more complexities in prediction. Recently, however, a few researchers have successfully overcome these complexities and developed several prediction models accounting for RAC and fiber-based concrete. Deng et al. (2018) used deep learning to develop a SoftMax regression model using recycled aggregate replacement ratio, water-to-cement ratio and fly ash content as input parameters. They concluded that their model demonstrated high accuracy and strong generalization capacity. A study by Meddage et al. (2024) reported higher accuracy for the XGB model, with an $R^{2}$ score of 0.981, for concrete with graphene oxide. Sabău and Remolina Duran (2022) used multiple linear regression and found a good correlation for predicting the CS of recycled aggregate concrete with different aging conditions. Dabiri et al. (2022a) developed linear, nonlinear and RF regression models, compared them with an ANN model and found RF to be the most accurate. Based on the findings of their model, they deduced that incorporating RA decreases the CS of concrete.

However, it has been demonstrated that incorporating fiber into plain concrete as well as RAC can improve crack patterns, flexural toughness and CS. A few researchers have also developed several models for fiber-based concrete. For instance, Chen et al. (2021) found the CNN to be an accurate and flexible model for predicting the CS of fiber-reinforced concrete at elevated temperatures and proposed this model as an appropriate mix design optimization tool. Huang et al. (2021) applied a CNN to predict the CS of basalt FRRAC, and the model exhibited a maximum 3% residual error distribution. Pal et al. (2023) applied different ML models to fiber-reinforced concrete containing waste rubber and recycled aggregate and found CatBoost to be the most efficient model.

In sum, most of the existing studies developed efficient models for predicting the CS of advanced types of concrete such as RAC, fiber-reinforced concrete and FRRAC. However, most of these models were developed using a single type of fiber and lacked essential input parameters, such as the water absorption of recycled aggregate. Therefore, this study comprehensively evaluated several predictive models for HFRRAC, incorporating critical input parameters such as recycled aggregate and fiber dosage, fiber types, water absorption of recycled aggregate, water-to-binder ratio and aggregate proportions to identify the most efficient model for optimizing mix design.

2. Materials and method

2.1 Dataset development

A total of 634 concrete CS test results were retrieved from 37 peer-reviewed experimental studies (Gao et al., 1997; Nataraja et al., 1999; Thomas and Ramaswamy, 2007; Bhikshma and Manipal, 2012; Chen et al., 2014; Akça et al., 2015; Zakaria et al., 2015; Mohammad et al., 2016; Senaratne et al., 2016; Afroughsabet et al., 2017; Abbass et al., 2018; Das et al., 2018; Hanumesh et al., 2018; Islam and Ahmed, 2018; Lima et al., 2018, 2019; Aslani et al., 2019; Lee, 2019; Matar and Assaad, 2019; Matar and Zéhil, 2019; Nitesh et al., 2019; Ramesh et al., 2019; Ahmed et al., 2020; He et al., 2020; Zhang et al., 2020; Zhou et al., 2020; Ahmad et al., 2021; Bheel et al., 2021; Gao and Wang, 2021; Gao et al., 2021; Kachouh et al., 2021; Ren et al., 2021; Ali et al., 2022; El Ouni et al., 2022; Zhang et al., 2022a, 2022b; Vijayan et al., 2022; Gong et al., 2023). The CS data involve different concrete mixes with important characteristics such as water to binder ratio (w/b), curing days, recycled aggregate replacement rate (RA), percentage of fiber, fiber type, coarse aggregate to binder ratio (CA/b), fine aggregate to binder ratio (FA/b) and water absorption rate of recycled aggregate as input and compressive strength (CS) as the desired output feature. The data set comprises a total of five distinct types of fiber, namely, steel, polypropylene, jute, nylon and sisal. Additionally, two types of composite fibers were also included; they were steel and polypropylene and nylon and jute. While there were some supplementary features in the data set development phase, only highly influential features with sufficient experimental data were taken into account. Owing to insufficient matching features and inadequate information, several test results were discarded. For repeated tests, the average value of CS was taken into account. The data retrieval procedure faced notable obstacles, particularly in dealing with missing data. 
To address these challenges, the missing data were imputed using a combination of the Gaussian mixture model (GMM) and KNN imputation methods. This approach first used GMM imputation to estimate missing values, which were then refined by KNN imputation.

2.1.1 Statistical analysis of the data set

A statistical summary of the parameters in the data set is provided in Table 1. The water-to-binder ratio (w/b) varies from 0.2 to 0.66, with a mean value of 0.42 and a standard deviation of 0.1. Notably, 75% of the observations for w/b are 0.5 or lower. The curing age of the concrete ranges from a minimum of 7 days to a maximum of 365 days. However, 75% of the test observations used a curing period of 28 days, resulting in an average curing age of 32.62 days. Table 1 also indicates that the average recycled aggregate replacement rate is 38.84%. The replacement rate ranges from 0%, indicating the exclusive use of NA, to 100%, indicating a complete substitution of NA with RA. The standard deviation for the RA replacement rate is 43.64.

Table 1

Statistical description of features in HFRRAC mix

Parameters	W/b	Curing	RA (%)	Fiber (%)	CA/b	FA/b	WA (%)	CS (MPa)
Count	634	634	634	634	634	634	634	634
Mean	0.42	32.62	38.84	0.66	2.16	1.65	2.05	44.42
SD	0.1	35.54	43.64	0.76	0.68	0.69	2.79	22.05
Min	0.2	7	0	0	0	0.29	0	11
Max	0.66	365	100	6	4.18	4	10.29	132
25th percentile	0.35	28	0	0	1.7	1.3	0	27
50th percentile	0.42	28	0	0.5	2.22	1.62	0	38.75
75th percentile	0.5	28	100	1	2.6	1.76	4.83	59.75

Source(s): Authors’ own work

In a similar vein, 25% of the test observations lack any form of fiber reinforcement. On average, however, each observation has a fiber content of 0.66%, with a standard deviation of 0.76. The mean fine aggregate-to-binder and coarse aggregate-to-binder ratios are 1.65 and 2.16, with standard deviations of 0.69 and 0.68, respectively. Again, 75% of the values for the water absorption rate of recycled aggregate were below 4.83%, with a standard deviation of 2.79. The designated output, CS, has a minimum value of 11 MPa and a maximum of 132 MPa. The mean CS value is 44.42 MPa, with 75% of the values falling below 59.75 MPa.

The distribution of CS (MPa) with the input features has been depicted in Figure 1. The output feature, CS (MPa), can be seen as a function of the input features and the range of CS (MPa) variation for each input feature value is also presented.

Figure 1

Eight scatter plots compare compressive strength in megapascals with factors such as water to binder ratio, curing time, recycled aggregate percentage, fiber percentage, fiber type, coarse aggregate to binder ratio, fine aggregate to binder ratio, and water absorption percentage for different fiber categories.

The figure consists of eight scatter plots examining the relationship between compressive strength in megapascals and various parameters for five fiber categories: no fiber, steel, polypropylene, other fibers, and jute. The first plot shows compressive strength decreasing with increasing water to binder ratio. The second plot displays varying strengths against curing time, with most data concentrated at shorter durations. The third shows a scattered distribution with recycled aggregate percentage. The fourth presents fiber percentage influence, with scattered trends across types. The fifth compares fiber types directly, showing distinct vertical groupings. The sixth relates coarse aggregate to binder ratio to strength, showing no clear pattern. The seventh explores fine aggregate to binder ratio effects, with overlapping distributions. The eighth presents water absorption percentage versus strength, showing an inverse tendency. The arrangement reveals how each parameter interacts with compressive strength across fiber categories.

Input feature distribution of targeted output CS (MPa)

Source(s): Authors’ own work

The distribution of fibers present in the HFRRAC mixes has been depicted in Figure 2. The data set shows that 28% do not include any fiber. Steel fiber stands out as the most prevalent among the various types of fibers, accounting for 27% of the total observations. The sisal fiber exhibits the lowest count, with 26 test observations among the single fibers. Additionally, two distinct categories of hybrid fibers exist (simultaneously mixing two single fibers), namely, 1% steel and polypropylene and 1% nylon and jute, which are the least prevalent.

Figure 2

A pie chart displaying the distribution of different fiber types, showing percentages for categories such as No fiber, Jute, and Steel.

The image shows a pie chart presenting the distribution of various fiber types with corresponding percentages. The chart is divided into segments representing No fiber at twenty-one percent, Jute at twenty-seven percent, Steel at thirteen percent, Nylon at six percent, Nylon and Jute at six percent, Steel and Polypropylene at twenty-seven percent, Polypropylene at four percent, and Sisal at one percent. Each category is colour-coded and identified in a legend below, ensuring clarity in distinguishing between the fiber types represented.

Distribution of input fiber types in HFRRAC mix data set

Source(s): Authors’ own work

2.1.2 Correlation among the features

Understanding the correlation between input and output features is crucial, as it shows how variables interact and influence each other before a prediction model is proposed. Besides influencing the output feature, some input features may be interdependent. Such interdependency among input variables with a high correlation coefficient may lead to multicollinearity, which can cause inefficiency and instability in prediction models.

The statistical metric known as the correlation coefficient (R) is used to quantify how much one variable changes in relation to another. Equation (1) defines the Pearson correlation coefficient as the covariance between two variables ($COV_{xy}$) divided by the product of their standard deviations ($\sigma_{x}$, $\sigma_{y}$). R is bounded by $-1 \leq R \leq +1$, where an R value close to $-1$ or $+1$ indicates a stronger linear relationship between the two variables:

R_{xy} = \frac{COV_{xy}}{\sigma_{x} \sigma_{y}}

(1)
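As an illustration of equation (1), the following minimal Python sketch computes the Pearson coefficient from its definition. The function name `pearson_r` and the toy values are hypothetical and are not taken from the study’s data set; the sketch uses population (divide-by-n) statistics, which gives the same R as the sample form because the factor cancels.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (the 1/n factors cancel in the ratio).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov_xy / (sigma_x * sigma_y)

# Hypothetical toy values: strength falls linearly as w/b rises.
w_b = [0.30, 0.40, 0.50, 0.60]
cs = [70.0, 55.0, 40.0, 25.0]
r = pearson_r(w_b, cs)
print(round(r, 3))
```

A perfectly linear decreasing relationship such as this toy example sits at the lower bound of R, which is the direction of the w/b entry in Table 2.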

Table 2 presents the Pearson correlation coefficients, determined from equation (1), among the input and output features. The results show that the water-to-binder ratio ($R = -0.814$) has the strongest correlation with CS, with CS decreasing as w/b increases, while curing age ($R = +0.166$) shows the largest positive correlation, with CS increasing with curing age. Similar dependencies of CS on these features have also been reported elsewhere (Poon et al., 2004; Kisku et al., 2017; Pourbaba et al., 2018; Zheng et al., 2018; Sultana et al., 2020; Kang et al., 2021; Dabiri et al., 2022b; Zhang et al., 2022a, 2022b; Pakzad et al., 2023).

Table 2

Pearson correlation coefficient of the input and output features

Features	Input features							Output features
Features	w/b	Curing	RA (%)	Fiber (%)	CA/b	FA/b	WA (%)	CS (MPa)
Input features
w/b	1	0.035	−0.054	−0.139	0.524	0.428	−0.034	−0.814
Curing	0.035	1	0.02	−0.015	0.005	0.016	−0.037	0.166
RA (%)	−0.054	0.02	1	−0.085	−0.065	−0.093	0.642	−0.06
Fiber (%)	−0.139	−0.015	−0.085	1	−0.192	−0.164	−0.111	0.156
CA/b	0.524	0.005	−0.065	−0.192	1	0.192	0.006	−0.323
FA/b	0.428	0.016	−0.093	−0.164	0.192	1	−0.116	−0.329
WA (%)	−0.034	−0.037	0.642	−0.111	0.006	−0.116	1	−0.006
Output features
CS (MPa)	−0.814	0.166	−0.06	0.156	−0.323	−0.329	−0.006	1

Source(s): Authors’ own work

2.2 Preprocessing of data

Data preprocessing transforms unclean data into a format that is suitable for training an ML model. Data from diverse sources often remain in raw form, which makes them difficult to feed into ML models. Unclean and raw data may contain features that are measured in various units, have skewed or non-normal distributions and are in an unstructured format, specifically consisting of categorical values. Preprocessing plays a crucial role in transforming raw data into a structured format that ML models can learn from. Preprocessing approaches that transform categorical values into numerical values and standardize data to a similar range have been found to enhance the model’s overall performance.

2.2.1 Categorical encoding

Categorical data is a type of data comprising non-numerical values, such as textual data. ML algorithms usually perform better when applied to numerical data; hence, categorical data must be converted into numerical format before they are fed into the models, a step commonly referred to as categorical encoding. In this study, only one feature, fiber type, is categorical; it contains 5 distinct fiber categories across the 634 observations. The feature was encoded with the one-hot encoding method using the Scikit-learn module in Python. A binary column was created for each category in the original categorical variable. Each column denotes a distinct category, where a value of 1 denotes that the data point is associated with that category and a value of 0 denotes otherwise. In the case of steel fiber, a value of 1 denotes the fiber’s presence and a value of 0 denotes its absence.
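The column layout described above can be sketched in plain Python. This is only an illustration of what one-hot encoding produces, not the Scikit-learn implementation used in the study; the helper `one_hot_encode` and the category labels are hypothetical.

```python
def one_hot_encode(values):
    """Return the sorted category list and one binary row per input value."""
    categories = sorted(set(values))
    # Each row has a 1 in the column of its own category and 0 elsewhere.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Hypothetical fiber-type labels for five observations.
fiber_types = ["steel", "jute", "none", "steel", "steel+polypropylene"]
categories, encoded = one_hot_encode(fiber_types)

for label, row in zip(fiber_types, encoded):
    print(f"{label:22s} {row}")
```

Every row contains exactly one 1, so the encoding introduces no artificial ordering among the fiber types, unlike assigning them integer codes.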

2.2.2 Data set standardization

The analysis of Table 1 shows that the features have varying scales of numerical values, potentially leading to biased outcomes during the training process. Certain features may have a dominant influence on the learning process due to their larger scale. This dominating effect is mitigated through data standardization, which brings all features to a similar scale. A data standardization method called Z-score normalization rescales each feature to a mean of 0 and a standard deviation of 1. Equation (2) presents the Z-score normalization technique, where the scaled value $z$ is the difference between the original feature value ($x$) and the mean ($\mu$) of the feature values, divided by the standard deviation ($\sigma$) of the feature values in the data set:

z = \frac{x - \mu}{\sigma}

(2)
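Equation (2) can be applied to a single feature column as in the following minimal sketch. The helper name `z_score` and the sample curing ages are hypothetical; the population standard deviation (division by n) is assumed here, since the text does not state which form was used.

```python
import math

def z_score(values):
    """Standardize a feature column to zero mean and unit standard deviation."""
    n = len(values)
    mu = sum(values) / n
    # Population standard deviation (divide by n); an assumption of this sketch.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(v - mu) / sigma for v in values]

# Hypothetical curing ages in days.
curing_days = [7, 28, 28, 90]
scaled = z_score(curing_days)
print([round(z, 3) for z in scaled])
```

After scaling, a feature measured in days and a feature measured as a ratio both vary on the same unitless scale, which is what prevents the larger-valued feature from dominating distance-based or gradient-based learners.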

2.3 Model validation and performance metrics

A total of 11 different ML models were implemented in this study: Linear Regression, Ridge Regression, Lasso Regression, SVM, K-Nearest Neighbor, DT, AdaBoost, RF, CatBoost, XGBoost and LGBM. Model validation is a technique that checks whether models perform well on unseen data, guarding against overfitting. It comprises two main steps: data set splitting and cross-validation. The data set was split into a training set (80%) and a test set (20%). The training set is used to train the model, and the test set is used to evaluate its predictions against the true values.
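The two validation steps can be sketched with index bookkeeping alone. The function names, the random seed and the choice of five folds are assumptions made for illustration; the text does not specify the number of folds or the library routine used.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and hold out a fraction of them as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(train_indices, k=5):
    """Yield (fit, validation) index lists; each fold is the validation set once."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation

# 634 observations, as in the data set described above.
train_idx, test_idx = train_test_split_indices(634)
print(len(train_idx), len(test_idx))

for fit_idx, val_idx in k_fold_indices(train_idx, k=5):
    print(len(fit_idx), len(val_idx))
```

Cross-validation is run only on the training indices, so the held-out test set plays no part in model selection and remains a fair estimate of performance on unseen mixes.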

Performance metrics are quantitative measures used to evaluate the performance of a model on a given data set. These metrics evaluate the model’s performance in terms of its predictive accuracy and generalization ability. Several performance metrics were used in this study to evaluate model performance, namely adjusted $R^{2}$, mean squared error (MSE), mean absolute error (MAE) and root MSE (RMSE), which were also used in various studies for concrete strength prediction (Yuan et al., 2014; Getahun et al., 2018; Yu et al., 2018; Kaloop et al., 2020; Salami et al., 2021; Ahmad et al., 2022; Liu, 2022).

$R^{2}$ is a statistical metric that shows how much of the variance in the dependent variable is accounted for by the independent variables. It typically ranges from 0 to 1, and a higher value denotes a better model fit to the data. A modified form, adjusted $R^{2}$, penalizes the inclusion of redundant predictors by adjusting for both the number of predictors ($p$) and the number of observations ($n$) in the model. As a result, it is a more reliable metric when contrasting models with different numbers of predictors. Only adjusted $R^{2}$ is reported in the remaining sections. MSE is the average of the squared differences between predicted ($\hat{y}$) and actual ($y$) values; because the errors are squared, it is sensitive to outliers. RMSE, the square root of MSE, measures the residual standard deviation. In contrast, MAE gives a more intuitive measure of prediction error by averaging the absolute differences between predicted and actual values. The mathematical definitions of these performance metrics are presented in equations (3)–(7):

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - \hat{y_{i}})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(3)

\mathrm{Adj}\,R^{2} = 1 - \frac{(1 - R^{2}) (n - 1)}{n - p - 1}

(4)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y_{i}})}^{2}}

(5)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - \hat{y_{i}} |

(6)

\mathrm{MSE} = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y_{i}})}^{2}

(7)
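Equations (3)–(7) translate directly into code. The sketch below is illustrative only: the strength values and the number of predictors are hypothetical, and the function name is not from the source.

```python
from math import sqrt


def regression_metrics(y_true, y_pred, n_predictors):
    """Compute R2, adjusted R2, MSE, RMSE and MAE as in equations (3)-(7)."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)                 # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    mse = ss_res / n
    mae = sum(abs(y - yh) for y, yh in zip(y_true, y_pred)) / n
    return {"R2": r2, "AdjR2": adj_r2, "MSE": mse, "RMSE": sqrt(mse), "MAE": mae}


# Hypothetical compressive strengths in MPa.
y_true = [30.0, 42.5, 55.0, 38.0, 47.5, 61.0]
y_pred = [32.0, 41.0, 53.5, 40.0, 46.0, 59.0]
metrics = regression_metrics(y_true, y_pred, n_predictors=2)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because RMSE is the square root of MSE, the two can be cross-checked: the LGBM test MSE of 14.04 reported in this study corresponds to an RMSE of about 3.75.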

2.4 Hyperparameter tuning

The best set of hyperparameters for each model was found using grid search with cross-validation (GridSearchCV). Grid search involves specifying a grid of hyperparameter values to explore; for each combination in the grid, 5-fold cross-validation was performed. The training data were split into 5 equal folds, the models were trained on 4 folds and validated on the remaining fold, and the process was rotated through all folds and repeated for every combination in the hyperparameter grid. The combination with the best aggregated cross-validation result was selected, and the final models were trained with those hyperparameters. The hyperparameters used in each model are presented in Table 3.
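The procedure can be sketched without any library. The example below is a stand-in, not the study's setup: it tunes a one-feature ridge model on hypothetical toy data, and every function name and grid value is an assumption chosen for illustration.

```python
from itertools import product


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_ridge_1d(x, y, alpha, fit_intercept):
    """Closed-form ridge fit for one feature: y ~ w * x + b."""
    n = len(x)
    x_mean = sum(x) / n if fit_intercept else 0.0
    y_mean = sum(y) / n if fit_intercept else 0.0
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    w = sxy / (sxx + alpha)                       # alpha shrinks the slope
    return w, y_mean - w * x_mean


def grid_search_cv(x, y, param_grid, k=5):
    """Return the parameter combination with the lowest mean validation MSE."""
    names = list(param_grid)
    best_params, best_mse = None, float("inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        fold_mse = []
        for train, val in k_fold_indices(len(x), k):
            w, b = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], **params)
            errors = [(y[i] - (w * x[i] + b)) ** 2 for i in val]
            fold_mse.append(sum(errors) / len(errors))
        mean_mse = sum(fold_mse) / len(fold_mse)  # aggregate over the k folds
        if mean_mse < best_mse:
            best_params, best_mse = params, mean_mse
    return best_params, best_mse


# Hypothetical toy data: strength (MPa) falling roughly linearly with w/b.
x = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75]
y = [62.0, 58.5, 54.0, 51.0, 46.5, 43.0, 40.0, 35.5, 32.0, 29.0]
grid = {"alpha": [0.0, 0.1, 1.0, 10.0], "fit_intercept": [True, False]}
best_params, best_mse = grid_search_cv(x, y, grid)
final_w, final_b = fit_ridge_1d(x, y, **best_params)   # refit on all training data
print(best_params, round(best_mse, 3))
```

The last step mirrors the text: once the best combination is known, the model is refitted on the full training set with those hyperparameters.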

Table 3

Implemented algorithms and their hyperparameter values

No.	Models	Parameters	Values	Standard range
1	RR	alpha	10.0	0.1–100 (log scale)
		fit_intercept	True	Boolean (true/false)
		solver	sparse_cg	{“auto”, “svd”, “cholesky”, “lsqr”, “sparse_cg”, “sag”, “saga”}
2	Lasso	alpha	0.1	0.0001–1 (log scale)
		fit_intercept	True	Boolean (true/false)
		selection	random	{“cyclic”, “random”}
3	SVM	C	100	0.1–1000 (log scale)
		gamma	scale	{“scale”, “auto”} or float (0.001–10)
		kernel	rbf	{“linear”, “poly”, “rbf”, “sigmoid”}
4	KNN	algorithm	brute	{“auto”, “ball_tree”, “kd_tree”, “brute”}
		leaf_size	10	1–100
		metric	manhattan	{“euclidean”, “manhattan”, “minkowski”}
		n_neighbors	3	1–20
		p	1	1 (manhattan), 2 (euclidean)
		weights	distance	{“uniform”, “distance”}
5	DT	criterion	squared_error	{“squared_error”, “friedman_mse”, “absolute_error”}
		max_depth	20	1–100
		max_features	sqrt	{“sqrt”, “log2”, None} or int/float
		max_leaf_nodes	None	2–infinity or None
		min_samples_leaf	1	1–20
		min_samples_split	2	2–20
		random_state	42	Fixed seed
		splitter	best	{“best”, “random”}
6	RF	max_depth	30	1–100
		max_features	sqrt	{“sqrt”, “log2”, None} or int/float
		min_samples_leaf	1	1–20
		min_samples_split	2	2–20
		n_estimators	200	10–1,000
7	AB	learning_rate	1.0	0.01–1
		n_estimators	50	10–1,000
		random_state	72	Fixed seed
8	XGB	colsample_bytree	0.4	0.1–1
		gamma	0.1	0–infinity (typically 0–5)
		learning_rate	0.15	0.001–0.3
		max_depth	6	1–20
		min_child_weight	1	0–infinity (typically 1–10)
9	CB	learning_rate	0.3	0.001–0.3
		n_estimators	100	10–1,000
		random_state	50	Fixed seed
10	LGBM	learning_rate	0.1	0.001–0.3
		max_depth	5	1–20
		min_data_in_leaf	20	1–100
		n_estimators	450	10–1,000

Source(s): Authors’ own work

3. Results and discussion

3.1 Model performance

This section presents the performance metrics of the 11 implemented models for the CS prediction of HFRRAC. Table 4 lists the metrics of each model for both the train and test phases. Tree-based ensemble models such as LGBM, CB, XGB and RF outperform all other models. LGBM performed best in both phases, with Adj. $R^{2}$ scores of 0.988 (train) and 0.969 (test). The small gap between its train and test scores also indicates limited overfitting. LGBM captured the nonlinear nature of the HFRRAC data efficiently: its learning rate controls the contribution of each tree, which balances the bias-variance tradeoff and improves generalization. It achieved the lowest test MSE and RMSE (14.04 and 3.75), with an MAE of 2.76. The other ensemble models, CB, XGB and RF, also handle the complex and nonlinear feature interactions in the data, with test-phase Adj. $R^{2}$ scores of 0.967, 0.966 and 0.966, respectively.

Table 4

Performance metrics of each model in both train and test phases

Model	Phase	Adj $R^{2}$	MSE	MAE	RMSE
LR	Train	0.753	114.88	8.1	10.74
LR	Test	0.72	126.19	8.55	11.23
RR	Train	0.754	114.84	8.11	10.72
RR	Test	0.721	125.91	8.53	11.22
Lasso	Train	0.755	114.87	8.12	10.73
Lasso	Test	0.725	123.97	8.54	11.13
SVM	Train	0.948	33.5	3.17	5.79
SVM	Test	0.936	28.77	3.61	5.36
KNN	Train	0.973	12.74	1.28	3.57
KNN	Test	0.936	29.05	3.86	5.39
DT	Train	0.962	31.73	3.46	5.63
DT	Test	0.958	18.87	3.14	4.34
RF	Train	0.98	9.09	1.83	3.01
RF	Test	0.966	15.5	2.86	3.94
AB	Train	0.865	63.11	6.65	7.94
AB	Test	0.871	58.08	6.26	7.62
XGB	Train	0.982	8.42	1.63	2.9
XGB	Test	0.966	15.46	2.71	3.93
CB	Train	0.986	6.37	1.35	2.52
CB	Test	0.967	14.87	2.64	3.86
LGBM	Train	0.988	5.78	1.7	2.4
LGBM	Test	0.969	14.04	2.76	3.75

Source(s): Authors’ own work

In contrast, the simplest models, LR, RR and Lasso, perform poorly in both the train (Adj. $R^{2}$ = 0.753, 0.754, 0.755) and test (Adj. $R^{2}$ = 0.72, 0.721, 0.725) phases. These models cannot capture the complex pattern of HFRRAC’s nonlinear data. They inherently assume a simple linear relationship between input and output features and are unable to handle the high dimensionality of complex data. Several studies have also reported the inefficiency of these linear models, with low $R^{2}$ and high MSE scores (Khademi et al., 2016; Kang et al., 2021; Patil et al., 2023; Bansal et al., 2024).

3.1.1 Rank-wise analysis

The implemented models were further subjected to a comparative assessment using a method called “score analysis”, as employed by Pal et al. (2023). In this approach, each model is scored on a scale of 1 to N (where N = 11, the total number of implemented models) for each performance metric, in both the test and train phases. The best-performing model receives a score of 11, while the poorest receives a score of 1. The total score of a phase is the sum of the scores across the four performance metrics, and the final score of a model is the sum of its test and train total scores. The model with the highest final score is assigned the final rank of 1, while the model with the lowest final score is assigned rank 11. Table 5 shows that LGBM achieved outstanding performance in both the test and train phases, securing the top position with a final score of 83. CB achieved close scores of 41 and 40 in the test and train phases, making it the second-best performer with a final score of 81. XGB and RF are in third and fourth place, respectively, and LR has the lowest rank of 11, making it the poorest-performing model. This ranking can support mix design, quality control and structural safety by guiding the selection of efficient models for reliable CS prediction in HFRRAC while reducing dependence on physical testing.

Table 5

Rank-wise score analysis

Models	Phase	Adj $R^{2}$	MSE	MAE	RMSE	Total score	Final score	Final rank
LR	Train	1	1	3	1	6	10	11
LR	Test	1	1	1	1	4	10	11
RR	Train	2	3	2	3	10	19	9
RR	Test	2	2	3	2	9	19	9
Lasso	Train	3	2	1	2	8	19	9
Lasso	Test	3	3	2	3	11	19	9
SVM	Train	5	5	6	5	21	44	7
SVM	Test	5	6	6	6	23	44	7
KNN	Train	7	7	11	7	32	52	5
KNN	Test	5	5	5	5	20	52	5
DT	Train	6	6	5	6	23	51	6
DT	Test	7	7	7	7	28	51	6
RF	Train	8	8	7	8	31	63	4
RF	Test	8	8	8	8	32	63	4
AB	Train	4	4	4	4	16	32	8
AB	Test	4	4	4	4	16	32	8
XGB	Train	9	9	9	9	36	72	3
XGB	Test	8	9	10	9	36	72	3
CB	Train	10	10	10	10	40	81	2
CB	Test	10	10	11	10	41	81	2
LGBM	Train	11	11	8	11	41	83	1
LGBM	Test	11	11	9	11	42	83	1

Source(s): Authors’ own work
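The score analysis can be reproduced from the test-phase values in Table 4 with a short sketch. The tie rule used here (tied models share the lower score) is an inference from Table 5 rather than something the source states, and the function names are illustrative.

```python
# Test-phase metrics copied from Table 4: (Adj R2, MSE, MAE, RMSE).
test_metrics = {
    "LR":    (0.720, 126.19, 8.55, 11.23),
    "RR":    (0.721, 125.91, 8.53, 11.22),
    "Lasso": (0.725, 123.97, 8.54, 11.13),
    "SVM":   (0.936, 28.77, 3.61, 5.36),
    "KNN":   (0.936, 29.05, 3.86, 5.39),
    "DT":    (0.958, 18.87, 3.14, 4.34),
    "RF":    (0.966, 15.50, 2.86, 3.94),
    "AB":    (0.871, 58.08, 6.26, 7.62),
    "XGB":   (0.966, 15.46, 2.71, 3.93),
    "CB":    (0.967, 14.87, 2.64, 3.86),
    "LGBM":  (0.969, 14.04, 2.76, 3.75),
}
HIGHER_IS_BETTER = (True, False, False, False)    # only Adj R2 is better when larger


def metric_scores(values, higher_is_better):
    """Score = 1 + number of strictly worse models (assumed tie rule)."""
    scores = {}
    for model, value in values.items():
        if higher_is_better:
            worse = sum(1 for other in values.values() if other < value)
        else:
            worse = sum(1 for other in values.values() if other > value)
        scores[model] = 1 + worse
    return scores


def total_scores(metrics):
    """Sum the per-metric scores of every model for one phase."""
    totals = {model: 0 for model in metrics}
    for col, higher in enumerate(HIGHER_IS_BETTER):
        column = {model: row[col] for model, row in metrics.items()}
        for model, score in metric_scores(column, higher).items():
            totals[model] += score
    return totals


test_totals = total_scores(test_metrics)
for model, total in sorted(test_totals.items(), key=lambda item: -item[1]):
    print(model, total)
```

Under this tie rule the test-phase totals match Table 5, for example 42 for LGBM, 41 for CB and 4 for LR. The final score adds the train-phase total computed the same way.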

3.2 Rate of residual error

The difference between the actual and predicted CS is referred to as the residual error. Minimizing residual error in CS prediction is critical for enhancing safety margins, particularly for recycled concrete, where variability due to material source and quality is common. This section considers residual errors in the test phase only. The distribution of residual error, in terms of interquartile range (IQR), is illustrated for each model in Figure 3. XGB has the lowest IQR, followed by CB, LGBM and RF. In contrast, LR, RR and Lasso have the highest IQR, indicating that they are the least effective models. Although LGBM has a slightly higher IQR than XGB and CB, it also has the fewest outliers, which makes its performance more reliable. Figure 4 shows the kernel density estimate of the residual errors for each model. A model with lower skewness and a higher peak close to zero exhibits a higher degree of normality. Although CB and XGB have higher peaks close to zero, LGBM exhibits lower skewness, indicating a more nearly normal distribution. Hence, adopting such models can enhance the precision of performance-based concrete design in sustainable construction.
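The residual statistics discussed here (IQR, outlier count, skewness) can be computed with the standard library. This is a sketch under stated assumptions: the 1.5 × IQR outlier rule is the usual box-plot convention but is not confirmed by the source, the quartile method may differ from the plotting library the authors used, and the strengths are hypothetical.

```python
from statistics import fmean, pstdev, quantiles


def residual_summary(y_true, y_pred):
    """Summarize residuals (actual minus predicted) by IQR, outliers and skewness."""
    residuals = [y - yh for y, yh in zip(y_true, y_pred)]
    q1, _, q3 = quantiles(residuals, n=4)         # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr # assumed box-plot outlier fences
    outliers = [r for r in residuals if r < lower or r > upper]
    mu, sigma = fmean(residuals), pstdev(residuals)
    skewness = fmean([((r - mu) / sigma) ** 3 for r in residuals])
    return {"iqr": iqr, "n_outliers": len(outliers), "skewness": skewness}


# Hypothetical strengths in MPa; the last sample is badly underpredicted.
y_true = [38.0, 39.0, 39.0, 40.0, 40.0, 41.0, 41.0, 42.0, 55.0]
y_pred = [40.0] * 9
summary = residual_summary(y_true, y_pred)
print(summary)
```

A single large residual leaves the IQR almost unchanged but shows up as an outlier and as positive skewness, which is why the text considers the three measures together.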

Figure 3

The box plot displays residual error distributions for the eleven models: Linear Regression, Ridge Regression, Lasso Regression, Support Vector Regression, K-Nearest Neighbors, Decision Tree, Random Forest, AdaBoost, XGBoost, CatBoost and LightGBM. The horizontal axis lists the models, while the vertical axis represents residual error values. Each box indicates the interquartile range, with horizontal lines showing medians and whiskers extending to minimum and maximum values excluding outliers. Outliers are plotted as individual points.

Distribution of residual error for each model

Source(s): Authors’ own work

Figure 4

The density plot presents residual error distributions for Linear Regression, Ridge Regression, Lasso, Support Vector Regression, K-Nearest Neighbors, Decision Tree, Random Forest, AdaBoost, XGBoost, CatBoost and LightGBM. The horizontal axis shows residual error values from −40 to +40, while the vertical axis represents density up to approximately 0.14. Each model is depicted with a distinct curve, some of which overlap, reflecting similarities in error distributions.

Kernel density estimation of residual error

Source(s): Authors’ own work

3.3 Sensitivity analysis

Sensitivity analysis demonstrates the absolute importance of the input features influencing the CS. Figure 5 depicts the feature importance of the input variables that affect the CS of HFRRAC for SVR, KNN, DT, RF, AB, XGB, CB and LGBM. Absolute importance is given in the range of 0%–100%. The variation in feature importance across models is attributed to differences in their internal architecture and learning techniques. Despite these differences, w/b is clearly the most influential feature for the CS of HFRRAC in most of the models. This finding aligns with fundamental concrete mix design principles, where an increase in w/b significantly decreases the CS of concrete (Aïtcin, 2016; Kaplan et al., 2022; Kaur et al., 2023). FA/b, CA/b, curing age, RA (%) and fiber (%) also have significant importance across the models. Steel fiber demonstrated the highest influence among the fibers in our models. This result is consistent with its theoretical advantages, such as superior tensile strength and stiffness for crack bridging (Kang et al., 2017), and with the synergistic effects in hybrid fiber, where steel fibers dominate macro-crack resistance while other fibers (e.g., polypropylene) control micro-cracking (He et al., 2020).

Figure 5

The figure contains six bar charts, each representing the percentage importance of input variables for the Decision Tree, Random Forest, AdaBoost, XGBoost, CatBoost and LightGBM models. Variables include water-to-binder ratio, fine aggregate-to-binder ratio, coarse aggregate-to-binder ratio, curing time, recycled aggregate percentage, fibre percentage and steel fibre. Across most models, water-to-binder ratio shows the highest importance, particularly in Decision Tree, Random Forest and AdaBoost. LightGBM displays a more balanced importance distribution among variables, while CatBoost and XGBoost also rank water-to-binder ratio highest but with notable contributions from fine aggregate-to-binder ratio and curing time.

Feature importance of input variables influencing CS for each model

Source(s): Authors’ own work

3.4 Model comparison and statistical scores

Figure 6 compares the implemented models through plots of true vs predicted CS of HFRRAC, whereas Figure 7 presents the statistical scores of the models. The slope of the line fitted to the predicted vs experimental values is denoted as “m”; it is the change in the predicted CS for a one-unit increase in the true CS. A value of m closer to 1 indicates a more precise model. Underprediction is indicated by an m value less than 1, and overprediction by an m value greater than 1. CB outperformed the other models with the highest m value of 0.962, closely followed by LGBM at 0.961.

Figure 6

The figure presents scatter plots for the eleven models: Linear Regression, Ridge Regression, Lasso Regression, Support Vector Regression, K-Nearest Neighbors, Decision Tree, Random Forest, AdaBoost, XGBoost, CatBoost and LightGBM, comparing experimental and predicted values. Each plot has experimental values on the horizontal axis and predicted values on the vertical axis, with a diagonal reference line indicating perfect prediction. Trend lines are fitted to the data points, most of which closely follow the diagonal, particularly for tree-based ensemble models such as Random Forest, XGBoost, CatBoost and LightGBM, which show smaller deviations than the linear regression-based approaches.

Scatter plot of actual vs predicted CS for each model

Source(s): Authors’ own work

Figure 7

The figure presents three radar charts labelled (a), (b) and (c), each comparing the eleven machine learning models: LR, LightGBM, CatBoost, XGBoost, AdaBoost, Random Forest, Decision Tree, K-Nearest Neighbors, Support Vector Regression, Lasso and Ridge Regression. Panels (a) and (b) have axis values ranging from 0 to 1, while panel (c) ranges from 0 to 0.4. Concentric grid lines provide reference levels, and shaded polygons indicate the score of each model.

Radar chart for statistical score of each model (a) slope, (b) correlation coefficient, (c) standard error

Source(s): Authors’ own work

The correlation coefficient, R, indicates the strength of the linear relationship between the predicted and true values. A high positive R value close to +1 indicates a strong relationship, whereas a low R value approaching zero reveals a minimal or nonexistent association between the true and predicted values. LGBM had the highest R value of 0.986, although both CB and XGB came close with R values of 0.985.

The standard error, in turn, quantifies the dispersion of the predicted values around the regression line as the standard deviation of the residuals. A smaller standard error of the regression signifies that the predicted values lie closer to the true values, indicating a better fit of the regression model to the data. A higher standard error implies greater dispersion of the predicted values around the regression line, indicating a less satisfactory fit. Figure 7 illustrates that LGBM, followed by XGB and CB, has the lowest standard error, indicating reduced variability and better model fit.
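The three statistical scores can be computed from a least-squares line of predicted on true values. This is a sketch, not the authors' procedure: the source gives no formula for the standard error, so the residual standard deviation with n − 2 degrees of freedom is an assumption, and the strengths are hypothetical.

```python
from math import sqrt


def fit_statistics(y_true, y_pred):
    """Return slope m, correlation coefficient R and standard error of the regression."""
    n = len(y_true)
    x_bar = sum(y_true) / n
    y_bar = sum(y_pred) / n
    sxx = sum((x - x_bar) ** 2 for x in y_true)
    syy = sum((y - y_bar) ** 2 for y in y_pred)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(y_true, y_pred))
    m = sxy / sxx                                 # slope of predicted vs true
    c = y_bar - m * x_bar                         # intercept of the fitted line
    r = sxy / sqrt(sxx * syy)                     # Pearson correlation coefficient
    sse = sum((y - (m * x + c)) ** 2 for x, y in zip(y_true, y_pred))
    std_err = sqrt(sse / (n - 2))                 # assumed definition, n - 2 degrees of freedom
    return m, r, std_err


# Hypothetical strengths in MPa.
m, r, std_err = fit_statistics([30.0, 40.0, 50.0, 60.0], [31.0, 41.0, 49.0, 58.0])
print(round(m, 3), round(r, 4), round(std_err, 3))
```

For these toy values the slope is 0.89, that is, the predictions rise more slowly than the true strengths. A perfect model would give m = 1, R = 1 and a standard error of 0.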

4. Conclusion

This study analyzed a data set of 634 samples compiled from various published experimental studies, using 11 ML models to predict the CS of HFRRAC. The hyperparameters were tuned by grid search, with 5-fold cross-validation used to counter overfitting. The key findings are as follows:

Among the eleven implemented ML models, LGBM emerged as the top performer, achieving an adjusted $R^{2}$ of 0.969, an MSE of 14.04, an MAE of 2.76 and an RMSE of 3.75. CatBoost followed closely with an adjusted $R^{2}$ of 0.967, an MSE of 14.87, an MAE of 2.64 and an RMSE of 3.86. XGBoost was the closest competitor to LGBM and CB.
The least effective models are Linear, Ridge and Lasso Regression, each with an adjusted $R^{2}$ score below 0.80. These models also exhibit high error metrics, with MSE exceeding 110, MAE over 8 and RMSE above 10. Their simplicity prevents them from capturing the complex, non-linear relationships within the data, making them unsuitable for predicting the CS of HFRRAC.
Among the tree-based ensemble models, AdaBoost is the worst performer, ranking 8th among the 11 ML models. Because it uses weak learners such as decision stumps, it may struggle with complex feature interactions and risks overfitting. In contrast, the other ensemble models capture these interactions more effectively with deeper trees.
According to the sensitivity analysis, w/b is the most influential parameter for CS: as w/b increases, CS decreases. CA/b, FA/b, RA (%) and curing age were also found to be important parameters for CS.
Steel fiber was found to be the most influential of the fibers. Higher dosages of some fibers, such as polypropylene and nylon, led to a reduction of CS. Using hybrid fibers overcame this issue and improved the CS of HFRRAC.

This study builds an alternative method of predicting the CS of HFRRAC using a data-driven technique with different ML models. The strong correlation between input and output features and the high prediction accuracy can assist the civil engineering community in selecting the optimal mix design and input attributes for subsequent modeling. However, this study lacks some key parameters, such as the inclination angle and orientation profile of the fibers, which account for isotropic effects (influence on mechanical strength) in HFRRAC. Future research could focus on including these parameters in the prediction model.

References

Abbas

,

Y.M.

and

Iqbal Khan

,

M.

(

2016

), “

Fiber–matrix interactions in fiber-reinforced concrete: a review

”,

Arabian Journal for Science and Engineering

, Vol.

41

No.

4

, pp.

1183

-

1198

, doi:

https://doi.org/10.1007/S13369-016-2099-1/METRICS

.

Google Scholar

Abbass

,

W.

,

Khan

,

M.I.

and

Mourad

,

S.

(

2018

), “

Evaluation of mechanical properties of steel fiber reinforced concrete with different strengths of concrete

”,

Construction and Building Materials

, Vol.

168

, pp.

556

-

569

, doi:

https://doi.org/10.1016/j.conbuildmat.2018.02.164

.

Google Scholar

Crossref

Abrams

,

D.

(

1919

),

Design of Concrete Mixtures, Structural Materials Research Laboratory

,

Structural Materials Research Laboratory, Lewis Institute

.

Google Scholar

Afroughsabet

,

V.

,

Biolzi

,

L.

and

Ozbakkaloglu

,

T.

(

2017

), “

Influence of double hooked-end steel fibers and slag on mechanical and durability properties of high performance recycled aggregate concrete

”,

Composite Structures

, Vol.

181

, pp.

273

-

284

, doi:

https://doi.org/10.1016/j.compstruct.2017.08.086

.

Google Scholar

Crossref

Ahmad

,

A.

, et al. (

2022

), “

Compressive strength prediction of fly ash-based geopolymer concrete via advanced machine learning techniques

”,

Case Studies in Construction Materials

, Vol.

16

, p.

e00840

, doi:

https://doi.org/10.1016/J.CSCM.2021.E00840

.

Google Scholar

Crossref

Ahmad

,

J.

, et al. (

2021

), “

Mechanical properties and durability assessment of nylon fiber reinforced self-compacting concrete

”,

Journal of Engineered Fibers and Fabrics

, Vol.

16

, doi:

https://doi.org/10.1177/15589250211062833

.

Google Scholar

Ahmed

,

T.W.

,

Ali

,

A.A.M.

and

Zidan

,

R.S.

(

2020

), “

Properties of high strength polypropylene fiber concrete containing recycled aggregate

”,

Construction and Building Materials

, Vol.

241

, doi:

https://doi.org/10.1016/j.conbuildmat.2020.118010

.

Google Scholar

Aïtcin

,

P.C.

(

2016

), “

The importance of the water–cement and water–binder ratios

”,

Science and Technology of Concrete Admixtures

, pp.

3

-

13

, doi:

https://doi.org/10.1016/B978-0-08-100693-1.00001-1

.

Google Scholar

Akça

,

K.I.R.

,

Çakir

,

Ö.

and

Ipek

,

M.

(

2015

), “

Properties of polypropylene fiber reinforced concrete using recycled aggregates

”,

Construction and Building Materials

, Vol.

98

, pp.

620

-

630

, doi:

https://doi.org/10.1016/J.CONBUILDMAT.2015.08.133

.

Google Scholar

Crossref

Alabduljabbar

,

H.

, et al. (

2024

), “

Assessment of the split tensile strength of fiber reinforced recycled aggregate concrete using interpretable approaches with graphical user interface

”,

Materials Today Communications

, Vol.

38

, p.

108009

, doi:

https://doi.org/10.1016/J.MTCOMM.2023.108009

.

Google Scholar

Crossref

Ali

,

B.

, et al. (

2022

), “

Improving the performance of recycled aggregate concrete using nylon waste fibers

”,

Case Studies in Construction Materials

, Vol.

17

, p.

e01468

, doi:

https://doi.org/10.1016/J.CSCM.2022.E01468

.

Google Scholar

Crossref

Asim

,

M.

, et al. (

2020

), “

Comparative experimental investigation of natural fibers reinforced light weight concrete as thermally efficient building materials

”,

Journal of Building Engineering

, Vol.

31

, p.

101411

, doi:

https://doi.org/10.1016/J.JOBE.2020.101411

.

Google Scholar

Crossref

Aslani

,

F.

, et al. (

2019

), “

Experimental analysis of fiber-reinforced recycled aggregate self-compacting concrete using waste recycled concrete aggregates, polypropylene, and steel fibers

”,

Structural Concrete

, Vol.

20

No.

5

, pp.

1670

-

1683

, doi:

https://doi.org/10.1002/suco.201800336

.

Google Scholar

Crossref

Atasham Ul Haq

,

M.

, et al. (

2024

), “

Optimal utilization of low-quality construction waste and industrial byproducts in sustainable recycled concrete

”,

Construction and Building Materials

, Vol.

428

, p.

136362

, doi:

https://doi.org/10.1016/J.CONBUILDMAT.2024.136362

.

Google Scholar

Crossref

Bansal

,

T.

,

Talakokula

,

V.

and

Saravanan

,

T.J.

(

2024

), “

Comparative study of machine learning methods to predict compressive strength of high-performance concrete and model validation on experimental data

”,

Asian Journal of Civil Engineering

, Vol.

25

No.

2

, pp.

1195

-

1206

, doi:

https://doi.org/10.1007/s42107-023-00836-6

.

Google Scholar

Crossref

Bheel

,

N.

, et al. (

2021

), “

Experimental study on engineering properties of cement concrete reinforced with nylon and jute fibers

”,

Buildings

, Vol.

11

No.

10

, doi:

https://doi.org/10.3390/buildings11100454

.

Google Scholar

Bhikshma

,

V.

and

Manipal

,

K.

(

2012

), “

Study on mechanical properties of recycled aggregate concrete containing steel fibers

”,

Asian Journal of Civil Engineering (Building and Housing

,

available at:

Link to Study on mechanical properties of recycled aggregate concrete containing steel fibersLink to the cited article

Google Scholar

Carneiro

,

J.A.

, et al. (

2014

), “

Compressive stress–strain behavior of steel fiber reinforced-recycled aggregate concrete

”,

Cement and Concrete Composites

, Vol.

46

, pp.

65

-

72

, doi:

https://doi.org/10.1016/J.CEMCONCOMP.2013.11.006

.

Google Scholar

Crossref

Chen

,

G.M.

, et al. (

2014

), “

Compressive behavior of steel fiber reinforced recycled aggregate concrete after exposure to elevated temperatures

”,

Construction and Building Materials

, Vol.

71

, pp.

1

-

15

, doi:

https://doi.org/10.1016/j.conbuildmat.2014.08.012

.

Google Scholar

Crossref

Chen

,

H.

,

Yang

,

J.

and

Chen

,

X.

(

2021

), “

A convolution-based deep learning approach for estimating compressive strength of fiber reinforced concrete at elevated temperatures

”,

Construction and Building Materials

, Vol.

313

, p.

125437

, doi:

https://doi.org/10.1016/J.CONBUILDMAT.2021.125437

.

Google Scholar

Crossref

Chen

,

K.

, et al. (

2021

), “

Critical evaluation of construction and demolition waste and associated environmental impacts: a scientometric analysis

”,

Journal of Cleaner Production

, Vol.

287

, p.

125071

, doi:

https://doi.org/10.1016/J.JCLEPRO.2020.125071

.

Google Scholar

Crossref

Chidiac

,

S.E.

,

Moutassem

,

F.

and

Mahmoodzadeh

,

F.

(

2013

), “

Compressive strength model for concrete

”,

Magazine of Concrete Research

, Vol.

65

No.

9

, pp.

557

-

572

, doi:

https://doi.org/10.1680/MACR.12.00167

.

Google Scholar

Crossref

Chopra

,

P.

, et al. (

2018

), “

Comparison of machine learning techniques for the prediction of compressive strength of concrete

”,

Advances in Civil Engineering

, Vol.

2018

No.

1

, p.

5481705

, doi:

https://doi.org/10.1155/2018/5481705

.

Google Scholar

Crossref

Cui

,

K.

, et al. (

2023

), “

Mechanical behavior of multiscale hybrid fiber reinforced recycled aggregate concrete subject to uniaxial compression

”,

Journal of Building Engineering

, Vol.

71

, p.

106504

, doi:

https://doi.org/10.1016/J.JOBE.2023.106504

.

Google Scholar

Crossref

Dabiri

,

H.

, et al. (

2022

a), “

Compressive strength of concrete with recycled aggregate; a machine learning-based evaluation

”,

Cleaner Materials

, Vol.

3

, doi:

https://doi.org/10.1016/j.clema.2022.100044

.

Google Scholar

Dabiri, H., et al. (2022b), “Compressive strength of concrete with recycled aggregate; a machine learning-based evaluation”, Cleaner Materials, Vol. 3, p. 100044, doi: https://doi.org/10.1016/J.CLEMA.2022.100044.

Das, C.S., et al. (2018), “Performance evaluation of polypropylene fibre reinforced recycled aggregate concrete”, Construction and Building Materials, Vol. 189, pp. 649-659, doi: https://doi.org/10.1016/j.conbuildmat.2018.09.036.

De Larrard, F. (1999), “Concrete mixture proportioning: a scientific approach”, Concrete Mixture Proportioning, doi: https://doi.org/10.1201/9781482272055.

Deng, F., et al. (2018), “Compressive strength prediction of recycled concrete based on deep learning”, Construction and Building Materials, Vol. 175, pp. 562-569, doi: https://doi.org/10.1016/j.conbuildmat.2018.04.169.

Dong, J.F., et al. (2022), “Freeze-thaw behaviour of basalt fibre reinforced recycled aggregate concrete filled CFRP tube specimens”, Engineering Structures, Vol. 273, p. 115088, doi: https://doi.org/10.1016/J.ENGSTRUCT.2022.115088.

Ekanayake, I.U., Meddage, D.P.P. and Rathnayake, U. (2022), “A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP)”, Case Studies in Construction Materials, Vol. 16, p. e01059, doi: https://doi.org/10.1016/J.CSCM.2022.E01059.

El Ouni, M.H., et al. (2022), “Mechanical performance, water and chloride permeability of hybrid steel-polypropylene fiber-reinforced recycled aggregate concrete”, Case Studies in Construction Materials, Vol. 16, p. e00831, doi: https://doi.org/10.1016/J.CSCM.2021.E00831.

Feng, D.C., et al. (2020), “Machine learning-based compressive strength prediction for concrete: an adaptive boosting approach”, Construction and Building Materials, Vol. 230, p. 117000, doi: https://doi.org/10.1016/J.CONBUILDMAT.2019.117000.

Feret, R. (1892), “Sur la compacité des mortiers hydrauliques”, Ann. Ponts et Chaussées, Mém. Doc., Vol. 4, pp. 5-164 (accessed 10 July 2024).

Gao, D. and Wang, F. (2021), “Effects of recycled fine aggregate and steel fiber on compressive and splitting tensile properties of concrete”, Journal of Building Engineering, Vol. 44, doi: https://doi.org/10.1016/j.jobe.2021.102631.

Gao, D., et al. (2021), “Mechanical properties of recycled fine aggregate concrete incorporating different types of fibers”, Construction and Building Materials, Vol. 298, doi: https://doi.org/10.1016/j.conbuildmat.2021.123732.

Gao, J., Sun, W. and Morino, K. (1997), “Mechanical properties of steel fiber-reinforced, high-strength, lightweight concrete”, Cement and Concrete Composites, Vol. 19 No. 4, pp. 307-313, doi: https://doi.org/10.1016/S0958-9465(97)00023-1.

Getahun, M.A., Shitote, S.M. and Abiero Gariy, Z.C. (2018), “Artificial neural network based modelling approach for strength prediction of concrete incorporating agricultural and construction wastes”, Construction and Building Materials, Vol. 190, pp. 517-525, doi: https://doi.org/10.1016/J.CONBUILDMAT.2018.09.097.

Gong, S., et al. (2023), “Mechanical properties of polypropylene fiber recycled brick aggregate concrete and its influencing factors by gray correlation analysis”, Sustainability (Switzerland), Vol. 15 No. 14, doi: https://doi.org/10.3390/su151411135.

Grabiec, A.M., et al. (2012), “Modification of recycled concrete aggregate by calcium carbonate biodeposition”, Construction and Building Materials, Vol. 34, pp. 145-150, doi: https://doi.org/10.1016/J.CONBUILDMAT.2012.02.027.

Han, J., Zhang, W. and Liu, Y. (2022), “Experimental study on freeze–thaw resistance of steel fiber-reinforced hydraulic concrete with two-grade aggregate”, Journal of Building Engineering, Vol. 60, p. 105181, doi: https://doi.org/10.1016/J.JOBE.2022.105181.

Hanumesh, B., Harish, B. and Venkata Ramana, N. (2018), “Influence of polypropylene fibres on recycled aggregate concrete”, Materials Today: Proceedings, Elsevier Ltd, Vol. 5 No. 1, pp. 1147-1155, doi: https://doi.org/10.1016/j.matpr.2017.11.195.

He, W., et al. (2020), “Experimental investigation on the mechanical properties and microstructure of hybrid fiber reinforced recycled aggregate concrete”, Construction and Building Materials, Vol. 261, p. 120488, doi: https://doi.org/10.1016/J.CONBUILDMAT.2020.120488.

Huang, M., et al. (2021), “Mechanical properties test and strength prediction on basalt fiber reinforced recycled concrete”, Advances in Civil Engineering, Vol. 2021 No. 1, doi: https://doi.org/10.1155/2021/6673416.

Islam, M.S. and Ahmed, S.J. (2018), “Influence of jute fiber on concrete properties”, Construction and Building Materials, Vol. 189, pp. 768-776, doi: https://doi.org/10.1016/j.conbuildmat.2018.09.048.

Kachouh, N., et al. (2021), “Shear behavior of steel-fiber-reinforced recycled aggregate concrete deep beams”, Buildings, Vol. 11 No. 9, doi: https://doi.org/10.3390/buildings11090423.

Kaloop, M.R., et al. (2020), “Compressive strength prediction of high-performance concrete using gradient tree boosting machine”, Construction and Building Materials, Vol. 264, p. 120198, doi: https://doi.org/10.1016/J.CONBUILDMAT.2020.120198.

Kang, M.C., Yoo, D.Y. and Gupta, R. (2021), “Machine learning-based prediction for compressive and flexural strengths of steel fiber-reinforced concrete”, Construction and Building Materials, Vol. 266, p. 121117, doi: https://doi.org/10.1016/J.CONBUILDMAT.2020.121117.

Kang, W.H., et al. (2017), “Reliability based design of RC beams with recycled aggregate and steel fibres”, Structures, Vol. 11, pp. 135-145, doi: https://doi.org/10.1016/J.ISTRUC.2017.05.002.

Kaplan, G., et al. (2022), “Usage of recycled fine aggregates obtained from concretes with low w/c ratio in the production of masonry plaster and mortar”, Environment, Development and Sustainability, Vol. 24 No. 2, pp. 2685-2714, doi: https://doi.org/10.1007/S10668-021-01551-5.

Kaur, D., et al. (2023), “Compressive strength of bio-fibrous concrete”, Lecture Notes in Civil Engineering, Vol. 310, pp. 423-433, doi: https://doi.org/10.1007/978-981-19-8024-4_36.

Kaushik, S. and Bhan, P.S. (2024), “Chemical modifications of recycled concrete aggregate”, International Journal of Emerging Science and Engineering, Vol. 12 No. 7, pp. 7-12, doi: https://doi.org/10.35940/IJESE.G9900.12060724.

Khademi, F., et al. (2016), “Predicting strength of recycled aggregate concrete using artificial neural network, adaptive neuro-fuzzy inference system and multiple linear regression”, International Journal of Sustainable Built Environment, Vol. 5 No. 2, pp. 355-369, doi: https://doi.org/10.1016/j.ijsbe.2016.09.003.

Kisku, N., et al. (2017), “A critical review and assessment for usage of recycled aggregate as sustainable construction material”, Construction and Building Materials, Vol. 131, pp. 721-740, doi: https://doi.org/10.1016/J.CONBUILDMAT.2016.11.029.

Lee, G.C. and Choi, H.B. (2013), “Study on interfacial transition zone properties of recycled aggregate by micro-hardness test”, Construction and Building Materials, Vol. 40, pp. 455-460, doi: https://doi.org/10.1016/J.CONBUILDMAT.2012.09.114.

Lee, S. (2019), “Effect of nylon fiber addition on the performance of recycled aggregate concrete”, Applied Sciences, Vol. 9 No. 4, doi: https://doi.org/10.3390/app9040767.

Li, B., et al. (2024), “Specimen size effect on compressive and splitting tensile strengths of sustainable geopolymeric recycled aggregate concrete: experimental and theoretical analysis”, Journal of Cleaner Production, Vol. 434, p. 140154, doi: https://doi.org/10.1016/J.JCLEPRO.2023.140154.

Li, H., et al. (2022), “Compressive strength prediction of basalt fiber reinforced concrete via random forest algorithm”, Materials Today Communications, Vol. 30, p. 103117, doi: https://doi.org/10.1016/J.MTCOMM.2021.103117.

Lima, P.R.L., et al. (2018), “Short sisal fiber reinforced recycled concrete block for one-way precast concrete slabs”, Construction and Building Materials, Vol. 187, pp. 620-634, doi: https://doi.org/10.1016/j.conbuildmat.2018.07.184.

Lima, P.R.L., et al. (2019), “Experimental and numerical analysis of short sisal fiber-cement composites produced with recycled matrix”, European Journal of Environmental and Civil Engineering, Vol. 23 No. 1, pp. 70-84, doi: https://doi.org/10.1080/19648189.2016.1271357.

Liu, Y. (2022), “High-performance concrete strength prediction based on machine learning”, Computational Intelligence and Neuroscience, Vol. 2022, doi: https://doi.org/10.1155/2022/5802217.

Lyu, K., et al. (2019), “The effect of rough vs. smooth aggregate surfaces on the characteristics of the interfacial transition zone”, Cement and Concrete Composites, Vol. 99, pp. 49-61, doi: https://doi.org/10.1016/J.CEMCONCOMP.2019.03.001.

Mamen, B., et al. (2022), “Experimental investigation on the mechanical behavior of concrete reinforced with alfa plant fibers”, Frattura ed Integrità Strutturale, Vol. 16 No. 60, pp. 102-113, doi: https://doi.org/10.3221/IGF-ESIS.60.08.

Matar, P. and Assaad, J.J. (2019), “Concurrent effects of recycled aggregates and polypropylene fibers on workability and key strength properties of self-consolidating concrete”, Construction and Building Materials, Vol. 199, pp. 492-500, doi: https://doi.org/10.1016/j.conbuildmat.2018.12.091.

Matar, P. and Zéhil, G.P. (2019), “Effects of polypropylene fibers on the physical and mechanical properties of recycled aggregate concrete”, Journal of Wuhan University of Technology-Mater. Sci. Ed, Vol. 34 No. 6, pp. 1327-1344, doi: https://doi.org/10.1007/s11595-019-2196-6.

Meddage, D.P.P., et al. (2022), “Explainable machine learning (XML) to predict external wind pressure of a low-rise building in urban-like settings”, Journal of Wind Engineering and Industrial Aerodynamics, Vol. 226, p. 105027, doi: https://doi.org/10.1016/J.JWEIA.2022.105027.

Meddage, D.P.P., et al. (2024), “An explainable machine learning approach to predict the compressive strength of graphene oxide-based concrete”, Construction and Building Materials, Vol. 449, p. 138346, doi: https://doi.org/10.1016/J.CONBUILDMAT.2024.138346.

Meddage, D.P.P., Mohotti, D. and Wijesooriya, K. (2024), “Predicting transient wind loads on tall buildings in three-dimensional spatial coordinates using machine learning”, Journal of Building Engineering, Vol. 85, p. 108725, doi: https://doi.org/10.1016/J.JOBE.2024.108725.

Meesala, C.R. (2019), “Influence of different types of fiber on the properties of recycled aggregate concrete”, Structural Concrete, Vol. 20 No. 5, pp. 1656-1669, doi: https://doi.org/10.1002/SUCO.201900052.

Mistri, A., et al. (2020), “A review on different treatment methods for enhancing the properties of recycled aggregates for sustainable construction materials”, Construction and Building Materials, Vol. 233, p. 117894, doi: https://doi.org/10.1016/J.CONBUILDMAT.2019.117894.

Mohammad, W.N.S.W., Ismail, S. and Alwi, W.A.W. (2016), “Properties of recycled aggregate concrete reinforced with polypropylene fibre”, MATEC Web of Conferences, Vol. 66, p. 77, doi: https://doi.org/10.1051/MATECCONF/20166600077.

Nassif, H., et al. (2022), “Restrained shrinkage of high-performance ready-mix concrete reinforced with low volume fraction of hybrid fibers”, Polymers, Vol. 14 No. 22, p. 4934, doi: https://doi.org/10.3390/POLYM14224934.

Nataraja, M.C., Dhang, N. and Gupta, A.P. (1999), “Stress–strain curves for steel-fiber reinforced concrete under compression”, Cement and Concrete Composites, Vol. 21 Nos 5-6, pp. 383-390, doi: https://doi.org/10.1016/S0958-9465(99)00021-9.

Nitesh, K.J.N.S., Rao, S.V. and Kumar, P.R. (2019), “An experimental investigation on torsional behaviour of recycled aggregate based steel fiber reinforced self compacting concrete”, Journal of Building Engineering, Elsevier Ltd, Vol. 22, pp. 242-251, doi: https://doi.org/10.1016/j.jobe.2018.12.011.

Nithurshan, M. and Elakneswaran, Y. (2023), “A systematic review and assessment of concrete strength prediction models”, Case Studies in Construction Materials, Vol. 18, p. e01830, doi: https://doi.org/10.1016/J.CSCM.2023.E01830.

Osra, F.A., et al. (2024), “Environmental impact assessment of a dumping site: a case study of Kakia dumping site”, Sustainability, Vol. 16 No. 10, p. 3882, doi: https://doi.org/10.3390/SU16103882.

Pakravan, H.R., Latifi, M. and Jamshidi, M. (2017), “Hybrid short fiber reinforcement system in concrete: a review”, Construction and Building Materials, Vol. 142, pp. 280-294, doi: https://doi.org/10.1016/J.CONBUILDMAT.2017.03.059.

Pakzad, S.S., Roshan, N. and Ghalehnovi, M. (2023), “Comparison of various machine learning algorithms used for compressive strength prediction of steel fiber-reinforced concrete”, Scientific Reports, Vol. 13 No. 1, pp. 1-15, doi: https://doi.org/10.1038/s41598-023-30606-y.

Pal, A., et al. (2023), “Machine learning models for predicting compressive strength of fiber-reinforced concrete containing waste rubber and recycled aggregate”, Journal of Cleaner Production, Vol. 423, p. 138673, doi: https://doi.org/10.1016/J.JCLEPRO.2023.138673.

Pan, X., et al. (2017), “A review on concrete surface treatment part I: types and mechanisms”, Construction and Building Materials, Vol. 132, pp. 578-590, doi: https://doi.org/10.1016/J.CONBUILDMAT.2016.12.025.

Patil, S.V., Balakrishna Rao, K. and Nayak, G. (2023), “Prediction of recycled coarse aggregate concrete mechanical properties using multiple linear regression and artificial neural network”, Journal of Engineering, Design and Technology, Vol. 21 No. 6, pp. 1690-1709, doi: https://doi.org/10.1108/JEDT-07-2021-0373.

Poon, C.S., et al. (2004), “Influence of moisture states of natural and recycled aggregates on the slump and compressive strength of concrete”, Cement and Concrete Research, Vol. 34 No. 1, pp. 31-36, doi: https://doi.org/10.1016/S0008-8846(03)00186-8.

Popovics, S. (1985), “New formulas for the prediction of the effect of porosity on concrete strength”, Journal Proceedings, Vol. 82 No. 2, pp. 136-146, doi: https://doi.org/10.14359/10321.

Pourbaba, M., et al. (2018), “Effect of age on the compressive strength of ultra-high-performance fiber-reinforced concrete”, Construction and Building Materials, Vol. 175, pp. 402-410, doi: https://doi.org/10.1016/J.CONBUILDMAT.2018.04.203.

Qiu, J., Tng, D.Q.S. and Yang, E.H. (2014), “Surface treatment of recycled concrete aggregates through microbial carbonate precipitation”, Construction and Building Materials, Vol. 57, pp. 144-150, doi: https://doi.org/10.1016/J.CONBUILDMAT.2014.01.085.

Ramesh, R.B., Mirza, O. and Kang, W.H. (2019), “Mechanical properties of steel fiber reinforced recycled aggregate concrete”, Structural Concrete, Vol. 20 No. 2, pp. 745-755, doi: https://doi.org/10.1002/suco.201800156.

Rebai, B., et al. (2024), “Evaluation of self-compacting concrete for concrete repair applications”, Research on Engineering Structures & Materials, Vol. 11 No. 2, pp. 495-513, doi: https://doi.org/10.17515/RESM2024.255ST0423RS.

Ren, G., et al. (2021), “Influence of sisal fibers on the mechanical performance of ultra-high performance concretes”, Construction and Building Materials, Vol. 286, doi: https://doi.org/10.1016/j.conbuildmat.2021.122958.

Sabău, M. and Remolina Duran, J. (2022), “Prediction of compressive strength of general-use concrete mixes with recycled concrete aggregate”, International Journal of Pavement Research and Technology, Vol. 15 No. 1, pp. 73-85, doi: https://doi.org/10.1007/S42947-021-00012-6.

Salami, B.A., et al. (2021), “Data-driven model for ternary-blend concrete compressive strength prediction using machine learning approach”, Construction and Building Materials, Vol. 301, p. 124152, doi: https://doi.org/10.1016/J.CONBUILDMAT.2021.124152.

Senaratne, S., et al. (2016), “The costs and benefits of combining recycled aggregate with steel fibres as a sustainable, structural material”, Journal of Cleaner Production, Vol. 112, pp. 2318-2327, doi: https://doi.org/10.1016/j.jclepro.2015.10.041.

Shafighfard, T., et al. (2022), “Data-driven compressive strength prediction of steel fiber reinforced concrete (SFRC) subjected to elevated temperatures using stacked machine learning algorithms”, Journal of Materials Research and Technology, Vol. 21, pp. 3777-3794, doi: https://doi.org/10.1016/J.JMRT.2022.10.153.

Shi, C., et al. (2016), “Performance enhancement of recycled concrete aggregate – a review”, Journal of Cleaner Production, Vol. 112, pp. 466-472, doi: https://doi.org/10.1016/J.JCLEPRO.2015.08.057.

Shi, C., et al. (2018), “Performance of mortar prepared with recycled concrete aggregate enhanced by CO2 and pozzolan slurry”, Cement and Concrete Composites, Vol. 86, pp. 130-138, doi: https://doi.org/10.1016/J.CEMCONCOMP.2017.10.013.

Silva, R.V., de Brito, J. and Dhir, R.K. (2019), “Use of recycled aggregates arising from construction and demolition waste in new construction applications”, Journal of Cleaner Production, Vol. 236, p. 117629, doi: https://doi.org/10.1016/J.JCLEPRO.2019.117629.

Spaeth, V. and Djerbi Tegguer, A. (2013), “Improvement of recycled concrete aggregate properties by polymer treatments”, International Journal of Sustainable Built Environment, Vol. 2 No. 2, pp. 143-152, doi: https://doi.org/10.1016/J.IJSBE.2014.03.003.

Sultana, N., et al. (2020), “Soft computing approaches for comparative prediction of the mechanical properties of jute fiber reinforced concrete”, Advances in Engineering Software, Vol. 149, p. 102887, doi: https://doi.org/10.1016/J.ADVENGSOFT.2020.102887.

Tam, V.W.Y., Tam, C.M. and Le, K.N. (2007), “Removal of cement mortar remains from recycled aggregate using pre-soaking approaches”, Resources, Conservation and Recycling, Vol. 50 No. 1, pp. 82-101, doi: https://doi.org/10.1016/J.RESCONREC.2006.05.012.

Thomas, J. and Ramaswamy, A. (2007), “Mechanical properties of steel fiber-reinforced concrete”, Journal of Materials in Civil Engineering, Vol. 19 No. 5, pp. 385-392, doi: https://doi.org/10.1061/(ASCE)0899-1561(2007)19:5(385).

Vijayan, V., Jayakesh, K. and Anand, K.B. (2022), “Mechanical properties of recycled aggregates concrete with sisal fiber and silica fume”, Materials Today: Proceedings, Elsevier, pp. 1887-1894, doi: https://doi.org/10.1016/j.matpr.2022.05.055.

Wang, B., et al. (2021), “A comprehensive review on recycled aggregate and recycled aggregate concrete”, Resources, Conservation and Recycling, Vol. 171, p. 105565, doi: https://doi.org/10.1016/J.RESCONREC.2021.105565.

Wu, L., Sun, Z. and Cao, Y. (2024), “Modification of recycled aggregate and conservation and application of recycled aggregate concrete: a review”, Construction and Building Materials, Vol. 431, p. 136567, doi: https://doi.org/10.1016/J.CONBUILDMAT.2024.136567.

Xi, B., et al. (2024), “LGBM-based modeling scenarios to compressive strength of recycled aggregate concrete with SHAP analysis”, Mechanics of Advanced Materials and Structures, Vol. 31 No. 23, pp. 5999-6014, doi: https://doi.org/10.1080/15376494.2023.2224782.

Yu, Y., et al. (2018), “A novel optimised self-learning method for compressive strength prediction of high performance concrete”, Construction and Building Materials, Vol. 184, pp. 229-247, doi: https://doi.org/10.1016/J.CONBUILDMAT.2018.06.219.

Yuan, Z., Wang, L.N. and Ji, X. (2014), “Prediction of concrete compressive strength: research on hybrid models genetic based algorithms and ANFIS”, Advances in Engineering Software, Vol. 67, pp. 156-163, doi: https://doi.org/10.1016/J.ADVENGSOFT.2013.09.004.

Zaid, O., et al. (2022), “Characteristics of high-performance steel fiber reinforced recycled aggregate concrete utilizing mineral filler”, Case Studies in Construction Materials, Vol. 16, p. e00939, doi: https://doi.org/10.1016/J.CSCM.2022.E00939.

Zakaria, M., et al. (2015), “Effect of jute yarn on the mechanical behavior of concrete composites”, SpringerPlus, Vol. 4 No. 1, pp. 1-8, doi: https://doi.org/10.1186/s40064-015-1504-7.

Zari, M. (2024), “Characteristics and impact assessment of municipal solid waste (MSW)”, Springer Water, Part F2437, pp. 93-113, doi: https://doi.org/10.1007/978-3-031-52633-6_3.

Zhang, T., et al. (2020), “Mechanical properties of jute fiber-reinforced high-strength concrete”, Structural Concrete, Vol. 21 No. 2, pp. 703-712, doi: https://doi.org/10.1002/suco.201900012.

Zhang, C., et al. (2021), “Mechanical properties and microstructure of basalt fiber-reinforced recycled concrete”, Journal of Cleaner Production, Vol. 278, p. 123252, doi: https://doi.org/10.1016/J.JCLEPRO.2020.123252.

Zhang, G., et al. (2022a), “Properties of sustainable self-compacting concrete containing activated jute fiber and waste mineral powders”, Journal of Materials Research and Technology, Vol. 19, pp. 1740-1758, doi: https://doi.org/10.1016/J.JMRT.2022.05.148.

Zhang, T., et al. (2022b), “Investigation of impact resistance of high-performance polypropylene fiber-reinforced recycled aggregate concrete”, Crystals, Vol. 12 No. 5, doi: https://doi.org/10.3390/cryst12050669.

Zheng, C., et al. (2018), “Mechanical properties of recycled concrete with demolished waste concrete aggregate and clay brick aggregate”, Results in Physics, Vol. 9, pp. 1317-1322, doi: https://doi.org/10.1016/J.RINP.2018.04.061.

Zhou, C., et al. (2020), “Mechanical and damping properties of recycled aggregate concrete modified with air-entraining agent and polypropylene fiber”, Materials, Vol. 13 No. 8, doi: https://doi.org/10.3390/MA13082004.

© 2025 Md Rabiul Hasan, Aojoy Kumar Shuvo, Ehsanul Bashar Pranto, Mehedi Hasan and Md Mintu Miah

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen in the terms of the CC BY 4.0 licence.
