To achieve digitalisation, construction organisations need strategies to help align existing business structures with emerging digital technologies. To this end, this paper presents the findings of an assessment of the strategies required for construction digitalisation using South Africa as a point of reference.
The study adopted a mixed-method design using a Delphi and questionnaire survey, while the critical strategies for digital construction were identified using three different machine learning (ML)-based sensitivity analyses.
The study, through factor analysis, found five major groups of strategies: (1) understanding the construction market, (2) creating a digital culture, (3) technology deployment and assessment, (4) communication management and (5) finance. However, the three ML-based sensitivity analyses all revealed that technology deployment and assessment, as well as understanding the construction market, are the two most important strategies for construction digitalisation in South Africa.
The paper offers practical guidelines for construction organisations to be digitalised. It also offers methodological contributions to using ML in survey studies within construction. Theoretically, the study provides a foundation for future studies on strategies for construction digitalisation – an aspect that has received less attention in the current construction digitalisation discourse.
1. Introduction
The fourth industrial revolution, driven by physical, biological, and digital technologies, has continued to revolutionise diverse industries worldwide (Schwab, 2017). A remarkable emergence of this industrial revolution is digitalisation, which has been described as the ongoing deployment of digital technologies and digitised data across diverse economies and societies (Riso and Adăscăliței, 2024). According to Gartner Glossary (2018), organisations achieve digitalisation by using digital technologies to enhance business models and provide new revenue and opportunities that add value to their businesses. Berger (2016) described digitalisation as using information and communication technology-oriented tools and procedures to deliver services successfully. Additionally, Bahl (2015) described it as an innovation that links technology, business strategy, devices, data science, and design to change business processes. Digitalisation offers organisations the opportunity to deliver their services more easily and faster through cutting-edge technologies, affording them a competitive advantage. Evidence of digitalisation has been noted in many large global organisations such as Amazon, Faulkner Hayes, Netflix, and Lego, among others (Impact, 2024). These organisations have leveraged emerging digital technologies to meet their unique needs, transform their businesses, extend their reach, achieve customer satisfaction, and attain competitive advantage. As digital technologies advance, their impact is felt across diverse industries, and construction is no exception. With construction organisations designed to offer value to construction clients while also making profits, the clamour for digitalisation is beginning to resound among construction practitioners and researchers. To this end, drawing from previous descriptions of digitalisation, Aghimien et al.
(2021) described digitalisation in construction as adopting digital technologies in place of human effort to deliver construction services that are satisfactory to the client and for which the organisation can attain a competitive advantage. Smith (2024) noted that digitalisation offers construction organisations better efficiency, time and cost savings and improved cooperation between partners in the construction supply chain. It was further noted that construction organisations could improve efficiency and gain a competitive advantage by adopting emerging technologies such as building information modelling, digital twins, artificial intelligence, drones, and augmented and virtual realities, among others. Moreover, studies have also noted that the application of digital technologies promises immense benefits to age-long construction problems of delivering projects above budget, beyond the expected time, and below the agreed specification (Aghimien et al., 2021; Delgado et al., 2019; Oke et al., 2018).
Based on these inherent benefits of emerging technologies, several guidelines for the implementation of these technologies in organisations across diverse sectors have been established (Gill et al., 2016; Kane et al., 2015; Newman, 2017). One consistent theme across these guidelines is the need for organisations to be strategic in their implementation to ensure the successful integration of emerging technologies into existing business operations. Kane et al. (2015) noted that digitalisation in an organisation is not limited to acquiring new digital technologies but includes a strategic decision-making process. This submission is evident in construction, as Evans-Greenwood et al. (2019) noted that, given the industry's complex nature, no single digital technology will change it. However, a strategic use of the confluence of trends will allow the adoption of emerging digital technologies in a unique and disruptive manner that will bring about meaningful transformation. Also, exploring from a project-based firm perspective, Cao et al. (2023) emphasised the need for proper strategies under institutional pressure in a pre-digital era. However, the Economist Intelligence Unit (2017) earlier noted that most organisations find it hard to decide on the applicable strategy for innovation deployment that will transform their business. The situation is no different for construction organisations, where the adoption of digital technologies is slow-paced (Aghimien et al., 2021). Sawhney and Knight (2023) noted that the use of emerging digital technologies in construction as of 2023 can best be described as a “half-full or half-empty” situation where, despite the myriad opportunities inherent in emerging technologies, their adoption has not increased compared with the previous year.
The slow pace of the implementation of emerging digital technologies in construction organisations can be attributed to the paucity of construction-specific studies that identify the strategies required for digitalisation. Many of the digitalisation studies from within the construction industry have primarily focused on the use of emerging digital technologies to solve specific construction-related problems, the behavioural intention to adopt, and the factors influencing adoption and their benefits. As such, it is essential to identify key strategies that construction organisations (large and small) seeking digitalisation and the benefits therein can adopt as a roadmap to achieve this digital transformation. Herein lies the gap that this study hopes to fill.
In addressing the gap in the literature regarding strategies for construction digitalisation, this study acknowledges that not all organisations have the same level of resources and capabilities to adopt digitalisation at once. This is based on the notion that large construction companies can adopt the identified strategies simultaneously. However, small and medium enterprises, which comprise the largest share of organisations in the construction industry of developing countries, are challenged by resource constraints and may require a more piecemeal approach to embracing digitalisation. This implies that these small and medium organisations might need to take their digitalisation journey one step at a time. As such, to prioritise these strategies for easy identification and adoption by those organisations that might not be able to implement them all at once, the study employed Machine Learning (ML) classification algorithms. The use of ML in construction survey research is uncommon. Albahri et al. (2022) noted that studies promoting ML in surveys have only started emerging across diverse fields. As such, this current study makes a methodological contribution to using ML for survey studies within the construction domain. Furthermore, the study offers practical guidelines for construction organisations to digitalise their businesses. Theoretically, the study provides a foundation for future studies on the strategies for construction digitalisation – an aspect that has received less attention in the current construction digitalisation discourse.
2. Strategies for construction digitalisation
In the quest for digitalisation of organisations, several guidelines and frameworks have been proposed mainly within the business sector (Boström and Celik, 2017; Leyh et al., 2021; Newman, 2017), education (Sheikhshoaei et al., 2018), healthcare (Dyk and Schutte, 2012), manufacturing (Gökalp and Martinez, 2021; Vivares et al., 2018), and telecommunication (Valdez-de-Leon, 2016). For instance, Gill et al. (2016) developed a framework that serves as a road map for business organisations to achieve digital maturity. This framework emphasised the need for the organisation to align with digital strategies, governance, and execution. Similarly, Boström and Celik (2017) proposed a digital strategising model to guide business owners with an emphasis on communication, value measurement, leadership, ecosystem, technology, and skill. Deloitte (2018) also proposed a digitalisation model for business with five key areas: strategy, customer, technology, operations, and organisation and culture. Likewise, in manufacturing, Gökalp and Martinez (2021) developed a digital transformation model that considers strategic governance and covers organisations' strategy. In the telecommunication sector, Valdez-de-Leon (2016) presented strategy and technology as crucial elements for digitalising communication service providers. A similar observation was made within the communication sector (Newman, 2017).
A key emerging theme from these frameworks is the emphasis on organisational strategy for achieving digitalisation. Kane et al. (2015) have noted that for organisations to achieve digitalisation, a strategic decision-making process to reshape the organisation's business model must be implemented alongside the acquisition of available technologies. Wikström et al. (2010) have also noted a strong relationship between an organisation's strategy and business model. As such, if digitalisation, which entails using digital technologies to transform business models (Gartner Glossary, 2018), is to be attained within construction, careful consideration needs to be given to the strategies adopted within the organisation. However, Fleisger (2018) has noted that strategy can mean different things to different people or organisations. It was further noted that from an organisational perspective, a strategy evolves and must be carefully crafted and effectively executed for any meaningful outcome to be derived. As such, Pidun (2019) described organisational strategy as the approach adopted by an organisation to ensure market competitiveness.
In relation to digitalisation, there is a similarity between the organisational strategy and the strategy adopted to attain digitalisation. Bharadwaj et al. (2013) described the digitalisation strategy as the process of leveraging digital resources to formulate and execute organisational strategy to achieve differential value. Hess et al. (2016) described it as the guidelines taken by an organisation to digitally transform its business by integrating emerging digital technologies. Matt et al. (2015) also described digitalisation strategy as transforming organisational processes, products and services to meet new technologies. As such, Turuk (2020) concluded that having a solid digitalisation strategy is an inevitable requirement for organisations hoping to succeed in the current digital world. According to Ernest and Young Global (2018), while emerging digital technologies propose mouth-watering benefits to an organisation's processes and outcomes, attaining these benefits will only be a dream if a well-planned strategy is not in place. Clearly, acquiring digital technologies is not enough to boost construction organisations' digital transformation and value delivery. Having a strategy to provide value to clients and retain these clients is essential. This is evident in the case of LEGO – an iconic toy manufacturing company that, in view of avoiding bankruptcy in 2004, commenced its digital transformation journey. The company implemented a comprehensive digital transformation strategy to improve its supply chain, product development, and customer experience. It streamlined its decision-making process, set a direction for future technological principles, digitalised its products, systems and processes, and ensured it promoted a digital workforce (Andersen and Ross, 2016).
Based on the above, several strategies need to be in place to achieve effective digitalisation in construction. Sprokholt et al. (2018) proposed that digitising products and processes, having smart integration, and connecting customers with a multi-sided ecosystem are the four key digitalisation strategy positions. To digitise the business process and product, organisations need to invest in emerging digital technologies and related skills (Dyk and Schutte, 2012; Newman, 2017). According to Newman (2017), there is a need for adequate financing and investment in digital technologies, which will help gain market intelligence and promote innovation. Moreover, Evans-Greenwood et al. (2019) noted that for construction organisations to capitalise on the disruption offered by emerging digital technologies, top management must identify and invest in key enabling trends to give themselves the real option of exploring these opportunities. Considering the nature of construction and its inherent challenges, diverse technological trends have emerged to help improve the construction process. Various management software packages, such as Bidhive for bid management, BRIX for construction enterprise resource planning, Procore for improved communication on construction projects, Buildertrend for improved client experience, and The EDGE for material take-off, among others, have been developed to ensure a digitised construction process (Olmstead, 2024). Employing these emerging software tools along with other digital technologies can help simplify and digitise the construction process and delivery.
Similarly, Mapingire et al. (2022) submitted that for digitalisation to occur, organisations must be able to use emerging technologies to change their processes and how their employees work. Aghimien et al. (2021) have noted that digitalisation transcends acquiring emerging digital tools but also having the right skilled workers to drive the implementation of these technologies to attain strategic organisational goals. Therefore, an important strategy for any organisation will be to invest in emerging technologies and train the existing workforce to use these technologies (Valdez-de-Leon, 2016). To invest in emerging technologies and select the right technology, construction organisations must be able to identify technologies that can benefit them. Investing in forward-thinking research and development (R&D) can achieve this. According to Teece (2007), R&D is important as it allows organisations to sense opportunities within their environment.
According to Mapingire et al. (2022), organisations must be able to digitise their customer experience. In the same vein, Gill et al. (2016) noted that an important strategy is how well a company uses customer and business data to measure success and inform new approaches. A typical example is the use of customer buying patterns by companies such as Amazon and Netflix to deliver tailor-made services (Impact, 2024). Also, Quinton et al. (2018) noted a need to prioritise forecasting clients' needs through proper evaluation of available data. This is crucial, particularly for construction organisations, as they operate in a highly competitive environment and strive to satisfy clients' ever-changing needs. Putting mechanisms in place and promoting a culture that ensures clients' future needs are forecasted is essential. Moreover, construction projects produce large amounts of data that can be used to make informed decisions about future projects and clients. Effectively storing, analysing and using these data through technologies like big data analytics and cloud computing can prove beneficial to construction organisations. As such, Turuk (2020) proposed using cloud computing and ensuring customer interaction as strategies for digitalisation within organisations.
Using feedback systems to improve organisations' digitalisation implementation has also been identified. De Carolis et al. (2017) noted that digitalisation processes employed within an organisation should be monitored and controlled using the careful assessment of feedback received from their execution. In construction, this feedback is obtained from the project stakeholders and is an important means of acquiring knowledge for improving subsequent projects and adopted technologies. Also, drawing from the dynamic capability theory, designed to ensure organisations have sustained competitive advantage (Teece et al., 1997), several strategies exist for organisations to strategically improve their business. Careful selection of appropriate digital technology, careful selection of market and network, and clear project and service boundaries (Teece, 2007) apply to the digitalisation of construction organisations. Through forward-thinking R&D, these organisations can effectively select the right tools that align with their business. This alignment becomes crucial as aligning digital technology with the business through effective planning is germane to digitalisation (Newman, 2017; Valdez-de-Leon, 2016). Based on the above, Table 1 summarises the strategies assessed in the context of the South African construction industry.
Table 1. Summary of the strategies for construction digitalisation
| Code | Strategies | Authors |
|---|---|---|
| SC1 | Promote and invest in research and development | Teece (2007) |
| SC2 | Careful selection of digital technologies | Teece (2007) |
| SC3 | Prioritise forecasting of clients' need | Quinton et al. (2018) |
| SC4 | Careful selection of choice of market and network | Teece (2007) |
| SC5 | Clear construction project and service boundary | Teece (2007) |
| SC6 | Prioritise the evaluation of clients' feedback | Bharadwaj et al. (2013), De Carolis et al. (2017) |
| SC7 | Invest in emerging digital technologies and related skills | Dyk and Schutte (2012), Newman (2017) |
| SC8 | Promote effective decision-making procedure | Teece (2007) |
| SC9 | Create measures for digital performance measurement | Gill et al. (2016), Teece (2007) |
| SC10 | Avoid decision errors through the use of technologies | Teece (2007) |
| SC11 | Ensure effective information and communication management | Bennis (2013) |
| SC12 | Align digital technology with the business through effective planning | Newman (2017), Valdez-de-Leon (2016) |
| SC13 | Create a digital risk culture within the organisation | McKeown and Philip (2003) |
| SC14 | Promote open innovation within the organisation | Teece (2007) |
3. Research methodology
In understanding the strategy construction organisations can adopt in their quest for digitalisation, pragmatic thinking was employed. This informed the use of a mixed-method research design.
3.1 Qualitative strand
The qualitative strand, through a Delphi, informed the questions asked in the quantitative strand of the study. The Delphi, a consensus-attaining process, has gained attention in recent construction-related studies as a means of forecasting and solving problems from experts' perspectives (Alomari et al., 2018; Ameyaw et al., 2016; Chan et al., 2001; Gohdes and Crews, 2004). Hallowell and Gambatese (2010) submitted that the Delphi is most suitable when intuitive judgment takes precedence over questions that can be answered through concrete measurements. However, it was further noted that significant variation exists in emerging Delphi studies, with these variations designed to suit diverse study objectives. For instance, in traditional Delphi studies open-ended questions are asked at the initial stage to gather variables, while subsequent stages are used to evaluate, prioritise or confirm these variables (Keeney et al., 2001). However, studies have emerged on using open and closed-ended questions even in earlier stages in cases where preliminary review has already revealed specific variables that require validation within the new context of study (Ameyaw et al., 2016). As such, for this current study, the Delphi method through open and closed-ended questions was used to confirm the applicability of the 14 strategies presented in Table 1. In doing this, following the suggestion in past Delphi studies, 32 experts were invited from construction organisations across South Africa. The selection of these experts followed past suggestions and criteria such as extensive years of experience, employment in a construction organisation or higher institution, professional membership, and a degree in a construction-related field (Ameyaw et al., 2016; Alomari et al., 2018; Chan et al., 2001; Hallowell and Gambatese, 2010). 
These experts were invited using an invitation letter that clearly stated the objectives of the study and guaranteed their anonymity throughout the process and their right to withdraw from the study at any point in time. Based on the above, 13 experts completed the two-round Delphi process. To determine consensus among these experts, the study used conventional methods such as median, interquartile deviation, Kendall's coefficient of concordance, and chi-square (χ2) derived from the feedback analyses from each round. At the end of the second round, a consensus was reached, and the fourteen strategies were adjudged as applicable to the construction domain. No additional variable was provided by the experts in the open-ended question.
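As an illustration of one of the consensus measures named above, the following is a minimal sketch of Kendall's coefficient of concordance (W) with the usual tie correction and its chi-square approximation. The function names and the ratings are hypothetical and are not the study's Delphi data; the sketch assumes every expert rates the same set of items and that at least one expert differentiates between items.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores in ascending order, giving tied scores their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Return (W, chi-square, degrees of freedom) for m experts rating n items."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(r[i] for r in ranked) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied scores per expert.
    ties = sum(c ** 3 - c for r in ratings for c in Counter(r).values())
    w = 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)
    return w, m * (n - 1) * w, n - 1


# Hypothetical five-point ratings from three experts for four strategies.
example = [[5, 4, 4, 3], [5, 5, 4, 3], [4, 5, 4, 2]]
w, chi2, df = kendalls_w(example)
print(f"W = {w:.3f}, chi-square = {chi2:.3f}, df = {df}")
```

W ranges from 0 (no agreement) to 1 (complete agreement), so a value that rises between Delphi rounds points towards growing consensus.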
3.2 Quantitative strand
The outcome of the Delphi led to the development of the research questionnaire, which targeted core construction professionals with at least five years of working experience in the South African construction industry and who were actively involved in construction projects in the country. These professionals include architects, engineers, construction and project managers, and quantity surveyors. The respective professional bodies of these construction professionals revealed a total population of 40,188 members (Engineering Council of South Africa, 2019; South African Council for the Architectural Profession, 2019; South African Council for Project and Construction Management Profession, 2018; South African Council for the Quantity Surveying Profession, 2018). Since this entire population is large, Cochran's sample size calculation with a 90% confidence level, a ±7% margin of error and a 0.5 estimated proportion of the population was employed to reduce the population to a manageable sample size of 546. The sampling approach adopted was snowball as it was difficult to determine the exact number of professionals with the set years of experience and practice at the time of the research. To avoid the potential sample bias and limited generalisability associated with snowball sampling, the study ensured that the target population was clearly defined and that the initial sets of professionals approached were diverse based on the different professions that had been identified earlier. Studies have noted that in questionnaire surveys, it is common to face challenges in obtaining responses from the entire target sample due to factors like non-response, contact difficulty, or lack of interest. As such, many studies have advocated for a response of up to 50% or more, while others have noted that a response rate of between 20% and 30% can be deemed adequate depending on the nature of the study (Akintoye, 2000; Moser and Kalton, 1999).
This current study retrieved responses from 222 construction professionals, representing a response rate of 40.5%. Considering the focus of the study on digitalisation, the need to get respondents with the set number of years of experience, and the minimum threshold for the method of data analysis to be conducted, the gathering of responses was considered adequate to draw a logical conclusion in the study. The questionnaire was sent electronically to the respondents using a Google Form link. A cover letter accompanied the questionnaire to inform the respondents of the objective of the study and its indirect benefit to them. The letter also informed them of anonymity throughout the survey, their right to withdraw from the study at any time without any consequence, and the data storage, use and destruction process. This was done to ensure ethical collection of data from the respondents. The closed-ended structured questionnaire was designed in two sections. The first section sought to identify the characteristics of the respondents to determine their suitability for the study. The second section assessed the influence of the strategy variables identified from the literature (presented in Table 1) and confirmed through Delphi on the digitalisation of construction organisations in South Africa. A five-point influence scale ranging from five (very large extent) to one (no extent at all) was adopted for this second section.
The data on the background of the respondents were analysed using percentages (%). The data gathered from the second section were first tested for reliability using the Cronbach alpha (α) test with a 0.7 cut-off. An α-value of 0.781 was derived, thus implying that the data gathered were reliable. Since the data were gathered from construction professionals from contracting, consulting and government organisations, the significant difference in the view of these groups of respondents in rating the variables was determined using the Kruskal-Wallis H-test (K-W) (Pallant, 2020). This test gives a χ2 and a p-value that should be < 0.05 for a significant difference across the different groups. The mean item score was also used to rank these variables in descending order. In addition, the fourteen variables were regrouped into a more manageable subscale using exploratory factor analysis (EFA) with varimax rotation (Tabachnick and Fidell, 2013). Yong and Pearce (2013) have noted that EFA can determine correlation patterns in a data set and use this relationship to rearrange factors into appropriate groups. Moreover, Pallant (2020) submitted that EFA is one of the most reliable approaches to reducing large variables into more manageable subscales. For EFA to be conducted, the factorability of the data was first tested using Kaiser–Meyer–Olkin (KMO) set at a threshold of 0.6 and Bartlett test of sphericity (BTS), which is expected to be significant at a p-value <0.05 (Pallant, 2020; Tabachnick and Fidell, 2013). Further to the grouping derived from EFA, sensitivity analysis using decision trees and random forests was conducted to assess these grouped strategies in order of their importance to construction digitalisation. Using both decision trees and random forests for sensitivity analysis allowed a comprehensive approach by leveraging their complementary strengths.
On the one hand, decision trees offer interpretability and insight into decision rules, enabling clear understanding of key factors. On the other hand, random forests enhance robustness and predictive accuracy using ensemble learning, reducing overfitting and capturing complex interactions. Together, both approaches give a reliable, interpretable, and thorough sensitivity assessment, which balances transparency with statistical stability. Before conducting the ML-based sensitivity analysis, the correlation of the variables in the dataset was checked using Spearman correlation (rs), variance inflation factor (VIF), and tolerance to avoid multicollinearity issues. The ML-based sensitivity analysis was conducted using three approaches viz: Decision tree's backward stepwise elimination (BSE), Random forest's mean decrease in impurity (MDI) and permutation feature importance (PI). These three approaches were combined to give a more robust understanding of the most important strategies that construction organisations need to focus on while seeking digital transformation.
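To make the idea behind permutation feature importance (PI) concrete, the sketch below shuffles one feature column at a time and records the average drop in accuracy. PI is model-agnostic, so the sketch uses a hypothetical rule-based stand-in classifier rather than a trained random forest; the function names, the two-feature dataset and the threshold are illustrative assumptions and are not drawn from the study.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which the classifier's prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the dataset with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def stand_in_classifier(row):
    """Hypothetical rule: predict 1 whenever the first feature scores 4 or more."""
    return int(row[0] >= 4)


# Made-up five-point scores for two grouped strategies and a binary outcome.
X = [[5, 1], [4, 5], [5, 3], [4, 2], [2, 4], [1, 5], [3, 1], [2, 3]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

for name, score in zip(["feature_0", "feature_1"], permutation_importance(stand_in_classifier, X, y)):
    print(name, round(score, 3))
```

Because the stand-in rule ignores the second feature, shuffling that column leaves accuracy unchanged and its importance is zero, whereas shuffling the first column degrades accuracy. A feature whose permutation produces a larger drop is ranked as more important, which is the logic used to order predictors in this kind of sensitivity analysis.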
4. Results and discussion
4.1 Background information
The respondents consist of architects (15.3%), engineers (26.6%), construction and project managers (26.1%) and quantity surveyors (32%). Most respondents have a bachelor's degree (51.8%), while 27.5%, 17.1% and 3.6% have master's degrees, diplomas and doctorates, respectively. In terms of organisation type, these respondents were drawn from contracting (48.6%), consulting (29.3%), and government organisations (22.1%). While 31.5% of these respondents had at least five years of working experience within the South African construction industry, the remaining 68.5% have well above five years of working experience. On average, the respondents have 9.2 years of working experience, which shows a considerably high number of years in the South African construction industry. Based on the background characteristics of the respondents, it is evident that the questions posed in this research were answered by construction professionals with adequate academic qualifications and professional experience. Therefore, their responses can be considered adequate, as they are based on their wealth of experience and understanding of the research questions.
4.2 Strategies for enhancing construction digitalisation
4.2.1 Exploratory factor analysis
Table 2 presents the rotated component matrix, the K-W test and the collinearity test conducted on the questionnaire data. Based on the K-W test, four of the 14 strategies assessed showed significant divergence in the views of the construction professionals from the three groups. Overall, however, K-W showed no statistically significant difference in the perspectives of the three groups of respondents, as a χ2-value of 4.088 and a non-significant p-value of 0.130 were derived. The overall ranking revealed that all the strategies assessed can influence the digitalisation of construction organisations if implemented, as they all have a mean item score well above the average of 3.0.
Table 2. Rotated component matrix for the strategies for construction digitalisation
| Strategies | Component loading | Communality (extraction) | Mean | K-W χ2 | K-W p-value | Tolerance | VIF |
|---|---|---|---|---|---|---|---|
| Component 1 – Understanding the construction market (27.6%) | | | | | | | |
| SC5 | 0.775 | 0.708 | 4.15 | 7.450 | 0.024** | 0.477 | 2.098 |
| SC4 | 0.750 | 0.730 | 4.12 | 2.751 | 0.253 | 0.530 | 1.888 |
| SC10 | 0.692 | 0.660 | 3.82 | 9.564 | 0.008** | 0.559 | 1.789 |
| Component 2 – Creating a digital culture (12.4%) | | | | | | | |
| SC13 | 0.810 | 0.683 | 4.84 | 3.142 | 0.208 | 0.676 | 1.479 |
| SC14 | 0.796 | 0.686 | 4.81 | 14.877 | 0.001** | 0.586 | 1.707 |
| SC12 | 0.509 | 0.716 | 4.59 | 2.371 | 0.306 | 0.427 | 2.345 |
| Component 3 – Technology deployment and performance assessment (10.3%) | | | | | | | |
| SC2 | 0.722 | 0.534 | 4.60 | 3.638 | 0.162 | 0.719 | 1.391 |
| SC9 | 0.640 | 0.531 | 4.69 | 1.920 | 0.383 | 0.676 | 1.478 |
| SC1 | 0.626 | 0.753 | 4.77 | 3.524 | 0.172 | 0.654 | 1.529 |
| SC3 | 0.503 | 0.494 | 4.77 | 3.288 | 0.193 | 0.664 | 1.505 |
| Component 4 – Communication Management (8.2%) | | | | | | | |
| SC6 | 0.747 | 0.627 | 4.77 | 9.568 | 0.008** | 0.749 | 1.334 |
| SC11 | 0.655 | 0.548 | 4.77 | 3.288 | 0.193 | 0.660 | 1.516 |
| Component 5 – Finance (7.3%) | | | | | | | |
| SC7 | 0.895 | 0.843 | 4.83 | 4.916 | 0.086 | 0.708 | 1.412 |
| SC8 | 0.561 | 0.712 | 4.50 | 3.843 | 0.146 | 0.436 | 2.295 |

Note: KMO measure of sampling adequacy = 0.689; Bartlett's test of sphericity: approx. χ2 = 878.18, df = 91, p-value = 0.000.
Note(s): Extraction Method: Principal Component Analysis
Rotation Method: Varimax with Kaiser Normalization
χ2 = Chi-square, Df = degree of freedom
** = significant @ <0.05
In conducting EFA, the factorability of the data was first determined using KMO and BTS. In Table 2, the derived KMO value of 0.689 is above the cut-off of 0.6 for data to be factorable. Moreover, the BTS gave a χ2 of 878.18 and a significant p-value of 0.000, thus confirming the factorability of the data. These KMO and BTS results, coupled with the study's sample size and the reliability of the data confirmed by the α-value of 0.781, show that the data for this study are suitable for EFA. Five principal components with eigenvalues above 1.0 were extracted, and together they account for 65.8% of the cumulative variance. The first principal component accounts for the highest percentage variance of 27.6% with three variables. Based on the latent similarity of these variables and following the suggestion of Williams et al. (2010) that the naming of an extracted factor must be theoretical and follow the researcher's understanding of existing literature, this component was named “understanding the construction market”. The second component accounts for 12.4% of the total variance with three variables and was named “creating a digital culture”. The third component accounts for 10.3% of the total variance and has four variables loading on it. It was named “technology deployment and assessment”. The fourth and fifth components each have two variables and account for 8.2% and 7.3% of the total variance, respectively. These components were named “communication management” and “finance” based on the nature of their variables.
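The factorability check and the eigenvalue rule described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the study's dataset or software: the sample size of 200, the random latent-factor structure and the function names `bartlett_sphericity` and `kaiser_retained` are all assumptions made for the example. It computes Bartlett's χ2 statistic and its degrees of freedom, then counts the components with eigenvalues above 1.0.

```python
import numpy as np

def bartlett_sphericity(data):
    """Return Bartlett's chi-square statistic and its degrees of freedom."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # log-determinant of the correlation matrix (at most 0 for a valid matrix)
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

def kaiser_retained(data):
    """Count components with eigenvalue > 1.0 and their share of total variance."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
    retained = int(np.sum(eigvals > 1.0))
    cumulative = float(eigvals[:retained].sum() / eigvals.sum())
    return retained, cumulative

# Synthetic stand-in for 14 survey items driven by 5 latent factors (hypothetical)
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 5))
loadings = rng.normal(size=(5, 14))
responses = factors @ loadings + rng.normal(size=(200, 14))

chi2, dof = bartlett_sphericity(responses)
retained, cumulative = kaiser_retained(responses)
print(f"Bartlett chi2={chi2:.1f}, df={dof}; retained={retained}, variance={cumulative:.1%}")
```

For 14 items the degrees of freedom are 14 × 13 / 2 = 91, which agrees with the Df reported in Table 2. Likewise, the five reported variance shares (27.6 + 12.4 + 10.3 + 8.2 + 7.3) sum to the 65.8% cumulative variance stated above.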
4.2.2 Collinearity analysis
For ML to be applied to the data gathered, collinearity, which is the non-independence of model predictors, was first assessed. Wilcox (2022) described it as the close relatedness of two or more predictors. According to Daoud (2017), a dataset is said to possess multicollinearity when two or more predictors are highly interrelated and, as such, lead to an increase in the standard error. Also, Dormann et al. (2013) noted that in multivariate analysis, the existence of multicollinearity in a dataset would lead to increased variance of the predictors, which in turn makes it difficult to identify the relevant predictors in the model. To this end, the Spearman correlation (rs), VIF and tolerance were adopted to test for multicollinearity, as suggested in past studies (Pallant, 2020). The rs can be calculated using equation (1), and the closer the result is to +1, the more perfect the relationship between the variables. Based on the result in Figure 1, the rs derived ranged from −0.1 to 0.5, which is within the acceptable range and does not indicate multicollinearity. Also, Mojtahedi (2015) noted the importance of assessing tolerance, which depicts the magnitude of the variance of a variable that is not properly predicted by other variables. The VIF and tolerance can be derived using equations (2) and (3).
Figure 1. Correlation matrix testing multicollinearity (heatmap of SC1 to SC14; colour scale from −0.1 to 1.0). Source: Authors’ own work (2024)

| | SC1 | SC2 | SC3 | SC4 | SC5 | SC6 | SC7 | SC8 | SC9 | SC10 | SC11 | SC12 | SC13 | SC14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SC14 | 0.2 | 0 | 0.3 | 0.1 | 0.2 | 0.1 | 0.1 | 0.2 | 0 | 0.1 | 0.2 | 0.4 | 0.4 | 1 |
| SC13 | 0.2 | 0.2 | 0.1 | 0 | 0 | −0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 | 0.3 | 1 | |
| SC12 | 0 | 0.2 | 0.4 | 0.3 | 0.4 | 0.3 | 0.2 | 0.5 | 0.3 | 0.5 | 0.4 | 1 | | |
| SC11 | 0.2 | 0.1 | 0.2 | 0.2 | 0.3 | 0.3 | 0.2 | 0.3 | 0.2 | 0.2 | 1 | | | |
| SC10 | −0.1 | 0 | 0 | 0.3 | 0.5 | 0.2 | −0.1 | 0.4 | 0.1 | 1 | | | | |
| SC9 | 0.3 | 0.3 | 0.2 | 0.3 | 0.1 | 0.1 | 0.2 | 0.4 | 1 | | | | | |
| SC8 | 0.2 | 0.3 | 0.4 | 0.3 | 0.4 | 0.3 | 0.4 | 1 | | | | | | |
| SC7 | 0.1 | 0.2 | 0.2 | 0.1 | 0.1 | 0.2 | 1 | | | | | | | |
| SC6 | 0.1 | 0.1 | 0 | 0.2 | 0.3 | 1 | | | | | | | | |
| SC5 | 0 | 0 | 0.2 | 0.5 | 1 | | | | | | | | | |
| SC4 | −0.1 | 0.2 | 0.3 | 1 | | | | | | | | | | |
| SC3 | 0.1 | 0.1 | 1 | | | | | | | | | | | |
| SC2 | 0.3 | 1 | | | | | | | | | | | | |
| SC1 | 1 | | | | | | | | | | | | | |
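$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \tag{1}$$

$$\mathrm{VIF} = \frac{1}{1 - R^2} \tag{2}$$

$$\mathrm{Tolerance} = 1 - R^2 = \frac{1}{\mathrm{VIF}} \tag{3}$$

where $d_i$ is the difference between the two ranks of each observation, $n$ is the number of observations, and $R^2$ is the unadjusted coefficient of determination.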
Table 2 also shows that the derived VIF ranges from 1.334 to 2.345, while the tolerance value ranges from 0.427 to 0.749. According to Hair et al. (2010), an ideal VIF should be below 5, with a tolerance above 0.10, for multicollinearity not to exist within a dataset. Based on these values, coupled with the result from the rs, it can be deduced that the dataset is not affected by multicollinearity.
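The Spearman, VIF and tolerance calculations described above can be illustrated with a short sketch. The function names and the small rank example are hypothetical, and the Spearman shortcut formula used here assumes no tied values; Likert-type survey data usually contain ties, in which case average ranks and a Pearson correlation on the ranks would be the safer route.

```python
def spearman_rs(x, y):
    """Spearman's rs via 1 - 6*sum(d_i^2) / (n*(n^2 - 1)); assumes no tied values."""
    n = len(x)
    rank_x = {value: pos + 1 for pos, value in enumerate(sorted(x))}
    rank_y = {value: pos + 1 for pos, value in enumerate(sorted(y))}
    sum_d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def vif_from_r2(r2):
    """Variance inflation factor for a predictor regressed on the other predictors."""
    return 1.0 / (1.0 - r2)

def tolerance_from_r2(r2):
    """Tolerance is the share of variance not explained by the other predictors."""
    return 1.0 - r2

# Hypothetical untied scores for five respondents
rs = spearman_rs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rs, 3))

# Consistency check against two rows of Table 2: tolerance should be about 1 / VIF
for vif, tolerance in [(2.345, 0.427), (1.334, 0.749)]:
    print(vif, tolerance, round(1.0 / vif, 3))
```

The reciprocal relationship holds for the reported pairs up to rounding, and every VIF sits well below the threshold of 5 cited from Hair et al. (2010).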
4.2.3 ML-based sensitivity analysis
The use of ML classification algorithms is gaining traction within survey research for categorical outcomes (Patel and Rana, 2014) because many of these algorithms are not overly dependent on the distributional assumptions of most traditional methods (Buskirk et al., 2018). ML is mostly used to predict future outcomes or determine the importance of input variables in developing a model (Li et al., 2022). This study mainly focuses on the latter by using ML to determine the important construction digitalisation grouped strategies. Sensitivity analysis determines the relative importance of variables that make up a model (Li et al., 2022). In this study, sensitivity analysis was done to determine the most important group of strategies required by construction organisations in their quest for digitalisation. This test became necessary since no clear-cut criteria exist for choosing one strategy over another. The decision tree and random forest were the ML algorithms used in conducting this sensitivity analysis. A decision tree was used to identify and rank the important strategies by using the backward stepwise elimination (BSE) method. The random forest algorithm was used by applying the mean decrease in impurity (MDI) and permutation feature importance (PI) methods. The results of these three techniques (i.e. BSE, MDI, and PI) were compared to establish the importance of the identified strategies. By using these three techniques, more robust and reinforced findings on how each grouped strategy will affect digital construction transformation were established.
4.2.4 Sensitivity analysis using decision tree's backward stepwise elimination
The BSE is a robust sensitivity analysis method in which a variable is excluded from a model, and its impact on the model's output is evaluated (Li et al., 2022). This implies that an ML model must be developed before variable exclusion is done. A decision tree classification algorithm was adopted for this analysis since it is one of the most popular classification approaches; it learns simple decision rules inferred from a dataset to predict the value of an outcome (Gupta, 2020). A decision tree divides the input data into subsets based on one or more attributes, and this process is repeated until an appropriate number of finer subsets is established (Hafeez et al., 2021). This algorithm is easy to understand, is effective for grouping purposes and allows variable selection while reducing the effort required for data preparation (Jijo and Abdulazeez, 2021).
In conducting this analysis, the dataset was split into two, with 70% used for training while the remaining 30% was used to test the model's accuracy. Furthermore, 5-fold cross-validation was used to guard against overfitting and to support hyperparameter tuning. Bayesian optimisation with 30 iterations was used for hyperparameter tuning. Tuning the model using this approach helped to achieve an optimal solution with improved efficiency, speed, and accuracy. The model was evaluated using a confusion matrix and overall accuracy metrics. The confusion matrix is a cross-tabulation of the actual value of the expected outcome (true class) for every sample against the predicted value (predicted class) of the expected outcome for every sample in the dataset. These accuracy metrics are determined by four major components, namely True Positive (TP), False Negative (FN), False Positive (FP) and True Negative (TN). For clarity, TP shows the number of cases wherein the model correctly predicted the positive target variable (Yes). FN indicates the number of cases wherein the model predicted a positive (Yes) target variable as negative (No). FP shows the number of cases wherein the model predicted a negative (No) target variable as positive (Yes). Lastly, TN shows the number of times the model correctly predicted a negative (No) target variable. By calculating the ratios of these values using equations (4) to (7), the accuracy of each prediction can be determined.
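The ratios built from TP, FN, FP and TN can be sketched as follows. Equations (4) to (7) are not reproduced in this text, so the definitions below are the commonly used ones and may not match the authors' exact set; the function name `binary_rates` and the counts are hypothetical.

```python
def binary_rates(tp, fn, fp, tn):
    """Commonly used ratios derived from the four confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),  # share of actual positives predicted as positive
        "FNR": fn / (tp + fn),  # share of actual positives predicted as negative
        "FPR": fp / (fp + tn),  # share of actual negatives predicted as positive
        "TNR": tn / (fp + tn),  # share of actual negatives predicted as negative
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical counts, for illustration only
rates = binary_rates(tp=40, fn=10, fp=5, tn=45)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

By construction, TPR and FNR sum to 1 for each true class, which is the pattern visible in the TPR and FNR columns of the confusion matrix figure.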
The confusion matrix of the decision tree model is presented in Figure 2, with the blue boxes showing instances where the predictions were accurate. The figure shows that the model correctly predicted 66.1% of the cases in the highest class (i.e. a TPR of 66.1%), in which these strategies will, to a very large extent, lead to construction digitalisation. It is important to assess the confusion matrix along with the receiver operating characteristic (ROC) curve to determine the accuracy of a model (Aghimien et al., 2022). The ROC curve plots the TPR against the FPR and is created by analysing a binary classifier system (Carter et al., 2016). In interpreting the graph in Figure 3, the area under the curve (AUC) was assessed. The value of AUC ranges from 0 to 1, with an AUC close to 1 depicting a more accurate model. The result reveals an AUC of 0.8033 (i.e. 80.3%), which indicates a very good predictive accuracy. To confirm the accuracy of the classification model, the overall classification accuracy was assessed using equation (8). The result showed that the model had an overall accuracy of 63% on the training data and 67% on the test data. This finding shows the model could give predictions with reasonably good accuracy.
Figure 2. Confusion matrix from the decision tree model (rows: true class; columns: predicted class; values are row percentages, with TPR and FNR per class). Source: Authors’ own work (2024)

| True class | Predicted 2 | Predicted 3 | Predicted 4 | Predicted 5 | TPR | FNR |
|---|---|---|---|---|---|---|
| 2 | 40.0% | | 20.0% | 40.0% | 40.0% | 60.0% |
| 3 | | 36.4% | 36.4% | 27.3% | 36.4% | 63.6% |
| 4 | | 1.2% | 65.4% | 33.3% | 65.4% | 34.6% |
| 5 | 3.4% | | 30.5% | 66.1% | 66.1% | 33.9% |
Figure 3. Receiver operating curve from decision tree model (horizontal axis: False Positive Rate, 0 to 1; vertical axis: True Positive Rate, 0 to 1). The curve starts at (0, 0), passes approximately through (0.01, 0.4), (0.02, 0.6), (0.1, 0.8) and (0.74, 0.8), and ends at (1, 1); the operating point is at approximately (0.01, 0.4), and the area under the curve is 0.8033. Note: point coordinates are approximated. Source: Authors’ own work (2024)
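As a rough check on how an AUC is obtained from a ROC curve, the area can be approximated with the trapezoidal rule. This is a sketch only: the points are the approximate coordinates described for the ROC figure, not the model's actual thresholds, and the helper name `trapezoid_auc` is hypothetical.

```python
def trapezoid_auc(points):
    """Approximate the area under a ROC curve from (FPR, TPR) points."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        # area of the trapezoid between two consecutive points
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Approximate (FPR, TPR) coordinates read from the ROC figure
roc_points = [(0.0, 0.0), (0.01, 0.4), (0.02, 0.6), (0.1, 0.8), (0.74, 0.8), (1.0, 1.0)]
auc = trapezoid_auc(roc_points)
print(round(auc, 3))
```

The approximation gives about 0.809, close to the reported AUC of 0.8033; the small gap is expected because the coordinates are rounded readings of the curve.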
Following the development of the decision tree model, a BSE sensitivity analysis was performed. In conducting the BSE, each group of strategies was excluded in turn from the model, and the accuracy of the overall model was assessed to see the importance of the removed group. The lower the overall accuracy derived, the more important the eliminated group. Findings show that the five grouped strategies are important at varied levels, as excluding any of them dropped the accuracy below the 67% previously derived. However, more focus should be given to variables relating to technology deployment and assessment (TDA) as well as understanding the construction market (UCM), as the exclusion of these groups saw the largest drops in model accuracy, to 59% and 60.3%, respectively. Excluding creating a digital culture (CDC) and finance (FIN) each led to a drop from the initial 67% to 62.8%, while the group with the least impact was communication management (CM), with a drop to 64.7%.
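The exclusion procedure described above can be sketched with scikit-learn. This is an illustrative outline under stated assumptions rather than the study's implementation: the data are synthetic, the mapping of columns to the five groups in `GROUPS` is hypothetical, and a fixed `max_depth` stands in for the Bayesian hyperparameter optimisation and cross-validation used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical assignment of 14 feature columns to the five strategy groups
GROUPS = {
    "UCM": [0, 1, 2],
    "CDC": [3, 4, 5],
    "TDA": [6, 7, 8, 9],
    "CM": [10, 11],
    "FIN": [12, 13],
}

def holdout_accuracy(X_train, X_test, y_train, y_test, columns):
    """Fit a decision tree on the chosen columns and score it on the test split."""
    model = DecisionTreeClassifier(max_depth=4, random_state=0)
    model.fit(X_train[:, columns], y_train)
    return model.score(X_test[:, columns], y_test)

def backward_stepwise_elimination(X, y, groups):
    """Drop one group at a time and record the fall in test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=0, stratify=y
    )
    all_columns = [c for cols in groups.values() for c in cols]
    baseline = holdout_accuracy(X_train, X_test, y_train, y_test, all_columns)
    drops = {}
    for name, cols in groups.items():
        kept = [c for c in all_columns if c not in cols]
        reduced = holdout_accuracy(X_train, X_test, y_train, y_test, kept)
        drops[name] = baseline - reduced
    # A larger drop in accuracy means a more important group
    ranking = sorted(drops, key=drops.get, reverse=True)
    return baseline, drops, ranking

# Synthetic stand-in data (not the survey responses)
X, y = make_classification(n_samples=300, n_features=14, n_informative=6, random_state=0)
baseline, drops, ranking = backward_stepwise_elimination(X, y, GROUPS)
print(f"baseline accuracy: {baseline:.3f}")
print("ranking by accuracy drop:", ranking)
```

Applied to the figures reported above, the same logic gives drops from the 67% baseline of 8.0 points for TDA, 6.7 for UCM, 4.2 each for CDC and FIN, and 2.3 for CM.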
4.2.5 Sensitivity analysis using random forest's mean decrease in impurity and permutation feature importance
Random forest is a tree ensemble method that reduces the model's bias and variance by growing several decision trees (Deng et al., 2018). Within random forest, a tree is trained on a bootstrap dataset created through random sampling, and this process is repeated several times until a forest of trees emerges (Sarica et al., 2017). Generally, random forest performs better than most tree-based models (Mienye et al., 2019). Also, it is not prone to over-fitting and is suitable for handling noisy data. Equation (9) presents random forest's general mathematical function.
$$f(x) = \frac{1}{c}\sum_{i=1}^{c} T_i(x) \tag{9}$$

where $x$ is the vectored input parameter, $c$ is the number of trees, and $T_i(x)$ is a single regression tree based on a subset of inputs and the bootstrapped samples.
In this study, the two commonly used random forest methods originally provided by Breiman (2001), i.e. MDI and PI, were used. MDI describes the sum of the gains attributed to all splits carried out along a given variable or covariate (Benard et al., 2022). This method is fast and calculates the results based on the Gini importance (Saarela and Jauhiainen, 2021). In conducting PI, a specific variable in the dataset is shuffled (in this case, one group at a time), and the difference between the error of the permuted and original datasets is computed (Benard et al., 2022). Variables with a large PI are termed the strong independent predictors of the output (Loef et al., 2022). Over the last few years, various studies have indicated the flaws of both methods. For example, unlike PI, MDI shows bias in the presence of correlated data. Nevertheless, PI is regarded as the most effective feature importance method (Benard et al., 2022). This study used both methods for a more robust comparison, and the result is presented in Figure 4. The feature importance scores of the strategies in each group were summed to determine the group's importance, and this was done for all five groups. As presented, understanding the construction market (UCM) ranked first, closely followed by technology deployment and assessment (TDA), in both cases. Creating a digital culture (CDC) and finance (FIN) ranked third and fourth, respectively, for the MDI method. This order was reversed in the PI method, where finance (FIN) ranked third and creating a digital culture (CDC) ranked fourth. Lastly, communication management (CM) ranked the least for both methods.
Figure 4. Sensitivity analysis of grouped digital construction strategies using MDI and PI approaches (two bar charts; horizontal axis: Predictors; vertical axis: Feature importance, 0.00 to 0.35). Note: all numerical values are approximated. Source: Authors’ own work (2024)

| Predictor | Mean decrease impurity | Permutation importance |
|---|---|---|
| UCM | 0.30 | 0.19 |
| CDC | 0.14 | 0.06 |
| TDA | 0.30 | 0.18 |
| CM | 0.09 | 0.04 |
| FIN | 0.14 | 0.14 |
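The two random forest importance measures, summed per group, can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the column-to-group mapping in `GROUPS` is hypothetical, and the sketch permutes one feature at a time and then sums scores within a group, which may differ from how the groups were permuted in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical assignment of 14 feature columns to the five strategy groups
GROUPS = {
    "UCM": [0, 1, 2],
    "CDC": [3, 4, 5],
    "TDA": [6, 7, 8, 9],
    "CM": [10, 11],
    "FIN": [12, 13],
}

def grouped_importances(X, y, groups):
    """Return per-group sums of MDI and permutation importance scores."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=0, stratify=y
    )
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X_train, y_train)
    # MDI: impurity-based importances, normalised by scikit-learn to sum to 1
    mdi = forest.feature_importances_
    # PI: mean fall in test accuracy when a single feature column is shuffled
    pi = permutation_importance(
        forest, X_test, y_test, n_repeats=10, random_state=0
    ).importances_mean
    mdi_grouped = {g: float(mdi[cols].sum()) for g, cols in groups.items()}
    pi_grouped = {g: float(pi[cols].sum()) for g, cols in groups.items()}
    return mdi_grouped, pi_grouped

# Synthetic stand-in data (not the survey responses)
X, y = make_classification(n_samples=300, n_features=14, n_informative=6, random_state=0)
mdi_grouped, pi_grouped = grouped_importances(X, y, GROUPS)
print("MDI:", {g: round(v, 3) for g, v in mdi_grouped.items()})
print("PI: ", {g: round(v, 3) for g, v in pi_grouped.items()})
```

Computing PI on the held-out split rather than the training split is a design choice in this sketch; it ties the importance score to generalisation rather than to what the forest memorised.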
4.2.6 Summary and deductions from sensitivity analysis using BSE, MDI and PI methods
Table 3 gives a summary of the grouped strategies ranking from all three sensitivity analyses employed in the study. When compared, UCM ranked first in both the MDI and PI and second in BSE. Similarly, TDA ranked first in the BSE and second in both the MDI and PI. The findings from these three methods further indicate that for construction digitalisation to occur within organisations, UCM and TDA must be considered, as these two strategies play a key role compared to the other strategies. Furthermore, CDC ranked third in BSE and MDI but fourth in PI, while FIN ranked fourth in BSE and MDI but third in PI. It can be deduced that these two grouped strategies have a lesser influence on digital transformation compared to UCM and TDA. However, when proposing models or implementing digitalisation in the construction industry, either of these two grouped strategies can be used alongside UCM and TDA as predictors. Lastly, CM ranked lowest for all methods. Hence, all three methods agree that CM is the least important grouped strategy for digital transformation. This implies that less attention and fewer resources should be placed on CM when compared to the rest.
Summary of grouped strategies ranking from all three SA methods
| S/N | Grouped strategies | BSE rank | MDI rank | PI rank |
|---|---|---|---|---|
| 1 | Technology deployment and assessment (TDA) | 1 | 2 | 2 |
| 2 | Understanding the construction market (UCM) | 2 | 1 | 1 |
| 3 | Creating a digital culture (CDC) | 3 | 3 | 4 |
| 4 | Finance (FIN) | 4 | 4 | 3 |
| 5 | Communication management (CM) | 5 | 5 | 5 |
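One simple way to read the three rankings together is to average them per group. The mean rank below is an illustrative aggregation only; it is an assumption of this sketch and not a step reported in the study. The ranks are taken from the summary table of the three sensitivity analysis methods.

```python
# Ranks per grouped strategy from the summary table (BSE, MDI, PI)
RANKS = {
    "TDA": {"BSE": 1, "MDI": 2, "PI": 2},
    "UCM": {"BSE": 2, "MDI": 1, "PI": 1},
    "CDC": {"BSE": 3, "MDI": 3, "PI": 4},
    "FIN": {"BSE": 4, "MDI": 4, "PI": 3},
    "CM": {"BSE": 5, "MDI": 5, "PI": 5},
}

def mean_rank(ranks):
    """Average each group's rank across the three methods (lower is better)."""
    return {group: sum(r.values()) / len(r) for group, r in ranks.items()}

summary = mean_rank(RANKS)
ordered = sorted(summary, key=summary.get)
for group in ordered:
    print(f"{group}: {summary[group]:.2f}")
```

Under this aggregation UCM (1.33) and TDA (1.67) remain clearly ahead of CDC (3.33), FIN (3.67) and CM (5.00), which is consistent with the deductions drawn above.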
4.3 Discussion and implications of findings
Based on the study's findings, the strategies identified from the literature are applicable to the attainment of construction digitalisation in South Africa, as indicated by the experts from the Delphi study. EFA grouped these strategies into (1) understanding the construction market, (2) creating a digital culture, (3) technology deployment and assessment, (4) communication management and (5) finance. Following these groupings, ML-based sensitivity analysis revealed that these identified strategies have high predictive accuracy for attaining digitalisation within organisations adopting them. However, further scrutiny of the data through BSE, MDI and PI sensitivity analysis has shown that construction organisations seeking to be digitalised should first put effort into understanding the construction market as well as technology deployment and assessment. This result diverges somewhat from existing studies on digitalisation strategies, which have emphasised organisations digitising their products and services and their customers' experience as some of the core strategies required (Dyk and Schutted, 2012; Newman, 2017; Sprokholt et al., 2018). This divergence can be attributed to scope, as the current study was conducted from the perspective of the construction industry in a developing country.
The implication of the above findings is twofold. On one hand, the result provides guidance for large construction organisations with adequate resources to adopt all five groups of strategies at once to achieve digitalisation. On the other hand, the findings have revealed the major areas wherein construction organisations with limited resources (mostly small and medium organisations) can start their digitalisation journey. Adopting a piecemeal approach, by first understanding the construction market and then determining the right technology to deploy, while putting measures in place to assess the performance of these technologies, is vital, as observed from the sensitivity analysis. This approach becomes crucial, especially for the construction industry in developing countries (South Africa inclusive), which is driven largely by small and medium organisations (Aghimien et al., 2021). It is essential that the digitalisation agenda is not limited to large organisations with readily available resources but also creates direction for these small and medium organisations.
Understanding the construction market requires construction organisations to select their target market, have a clear project and service boundary and avoid decision errors through digital technologies. Interestingly, following previous studies, it was anticipated that investment in technology and skills would emerge as crucial, considering the nature of the construction industry, where adoption of emerging technologies is slow (Oke et al., 2018). However, ML revealed that understanding the market is crucial before selecting and investing in technologies. The selection of the target market and the creation of a clear boundary for the project and services delivered by construction organisations can be shaped by investment in R&D. As noted by Teece (2007), a solid investment in R&D offers organisations the opportunity to understand their environment and unearth opportunities. For small and medium organisations with limited resources to invest in forward-thinking R&D, carefully analysing areas of improvement and strategically investing in R&D in the identified areas can help streamline the use of resources. Moreover, collaboration with other organisations with complementary resources can prove useful (Ortega-Argilés et al., 2009). Haas (2015) has also noted the need for organisations to pay careful attention to their environment and sense opportunities that could help shape their selection of targeted clients. In the context of developing countries like South Africa, many construction organisations tend to handle several different jobs in the quest to stay afloat and survive the harsh construction environment. The implication of this involvement in diverse construction jobs means that these organisations will need to invest in digital technologies needed to deliver these different services. However, with careful observation of the construction market, organisations can adopt fewer technologies and perfect their usage in delivering their services to their clients.
To ensure technology deployment and assessment, organisations should be able to forecast the needs of clients using suitable digital tools, carefully select digital technologies that suit their business and meet their clients' needs, promote and invest in forward-thinking R&D, and create means to measure the performance of their digital uptake. These findings are in line with the submission of Quinton et al. (2018), who noted the need for organisations to be able to forecast the needs of their customers. As construction continues to evolve, it is important for organisations to pay attention to the changing needs of their clients and put digital tools in place to help meet these needs. Digital technologies like cloud computing and big data analytics offer these organisations a platform to store and retrieve information on client needs and to analyse this data to make effective decisions that will help ensure client satisfaction and, in the process, gain competitiveness for the organisation. Similarly, R&D plays a key role in ensuring construction organisations adopt the right technology for their business. While the construction industry in developing countries like South Africa has been notorious for its lack of investment in forward-thinking R&D (Rust and Koen, 2011), it can no longer be business as usual for organisations that wish to satisfy their clients' future needs, derive value for themselves and their clients and, at the same time, gain competitiveness through effective digitalisation.
Another aspect worth noting is the need to create a digital culture and invest in carefully selected digital technologies and the required skills. It has been noted that the construction industry in developing countries like South Africa is notorious for its poor digital culture (Ikuabe et al., 2020). If organisations within this industry are to be digitalised, attention should be paid to cultivating a culture that promotes digital technology uptake. Evidently, the transition from traditional methods of service delivery to a digital form can be seen as a risk, especially when the traditional approaches have been deemed adequate and when there is little practical evidence of the success of digital uptake within the immediate environment (Vass and Gustavsson, 2017). To this end, top management and owners of construction organisations need to put measures in place to evaluate and mitigate the risk associated with the selected digital tools. Adaptive experimentation, whereby organisations invest in smaller digital technology projects to gain a reasonable understanding of the inherent risks, can prove helpful (Quinton et al., 2018). While creating this digital culture, attention must also be given to financial investment in not just the technologies selected but also the people who will use them. Investment in key technologies such as building information modelling, which has been deemed central to the digitalisation of construction, is crucial. Other technologies, such as the Internet of Things, big data analytics, drones, sensors, and augmented and virtual realities, promise immense benefits to construction organisations and are worth careful assessment and investment. However, the skills required for these technologies should also be given adequate consideration. This is important because past studies have noted that an organisation's digitalisation depends not solely on acquiring these technologies but also on having the right skills to handle them and drive digital change (Frankiewicz and Chamorro-Premuzic, 2020). For a construction industry like South Africa's, which has been characterised by a skills shortage, particularly in relation to the technical skills needed to handle digital technologies (Oke et al., 2018), organisations seeking to be digitally transformed must invest in their workforce to effectively integrate selected digital technologies into their business functions.
5. Conclusion
The current digitalisation discourse in diverse industries has moved from whether to adopt digital technologies to how these emerging technologies should be adopted and integrated into existing organisational functions to gain the benefits thereof. Considering the slow-paced adoption of digital technologies within construction, this study explored the strategies required for construction digitalisation using South Africa as a case study. The study concludes that construction organisations can attain digitalisation by considering five groups of strategies, viz., understanding the construction market, technology deployment and assessment, creating a digital culture, finance and effective communication management. The study's findings offer practical strategies for enhancing the digitalisation of construction organisations. While some organisations might be able to adopt all five groups of strategies at once, others with limited resources can adopt a piecemeal approach by first considering the two major groups of strategies identified via sensitivity analysis (i.e. understanding the construction market, and technology deployment and assessment).
Theoretically, the study provides a foundation for future studies on the strategies for construction digitalisation, an aspect that has received little attention in the current discourse on construction digitalisation, particularly in developing countries like South Africa. In addition, the study makes a methodological contribution to the use of ML classification for survey studies within the construction industry. As the use of ML is gradually gaining recognition in survey research, the approach adopted in this study can serve as a guide for future researchers seeking to adopt the same method within the construction domain. Despite these contributions, it is essential to note that the study's findings are limited to the South African construction industry. Further studies are encouraged in other developing countries, especially in Africa, where such studies are scarce. Also, the study focused on what construction organisations need to do to effectively use the wide range of available technologies and innovations to improve their businesses and competitiveness. The study did not place emphasis on a specific technology or approach. As such, further studies are recommended on the strategies for implementing specific technologies and their implications for digitalisation in the South African construction industry.

