Hybrid surrogate models of quay walls

Cerek, Kacper; Klos, Dagmara; Hadjiloo, Elnaz; Grabe, Jürgen

doi:10.1680/jcien.25.00451

The use of hybrid surrogate modelling techniques for the life-cycle management of anchored quay walls was investigated. A synthetic dataset was generated using a simplified structural model based on classical earth pressure theory, representing a range of geometrical and hydraulic boundary conditions. Two neural network (NN) architectures were developed and compared: (a) a baseline feedforward neural network (FNN) using static input features and (b) a hybrid model combining bidirectional long short-term memory (BiLSTM) layers with dense layers (BiLSTM–FNN), which incorporates sequential displacement data. Both models were tuned across multiple trials with varying architectures, activation functions and learning rates. The final architectures were deployed in supervised learning to train surrogate models. The BiLSTM–FNN model outperformed the FNN, achieving significantly lower validation loss and superior predictive accuracy, but at a higher computational cost. This modelling approach provides an effective tool for estimating internal structural forces such as maximum bending moments, thereby supporting predictive maintenance and optimised design. The results demonstrate the potential of hybrid NN architectures within digital twin frameworks for port infrastructure, contributing to enhanced resilience and more efficient resource use.

Notation

E: Young’s modulus of steel
EI: bending stiffness of sheet pile profile
H: total height of quay wall
I: moment of inertia
M_max: maximum bending moment of quay wall
N: number of data samples
R²: coefficient of determination
t: embedment depth of quay wall
u_x,y: horizontal displacement along quay wall
W_a: water level on excavated side
$X_{1, i}$: sequential input array for recurrent layers
$x_{k, i}$: static input feature for dense layers
$x_{y, i}^{R}$: sequential input variable in $X_{1, i}$
$\bar{y}$: mean of actual values $y_{1, i}$
$y_{1, i}$: actual value of output variable
${\hat{y}}_{1, i}$: predicted value of output variable
z_a: anchorage level of quay wall
$Ω_{predict}$: predicted subset
$Ω_{test}$: testing subset
$Ω_{train}$: training subset
$Ω_{val}$: validation subset

1. Introduction

The importance of sustainability in the civil engineering industry has grown significantly in recent years. Holistic approaches that consider environmental impacts across the entire project life cycle (Kendall et al., 2018; Samuelsson et al., 2024), based on standardised life-cycle assessment (LCA) methodologies (ISO, 2006a, 2006b), are now integrated from the early stages of geotechnical site investigation (Purdy et al., 2022), through the design and construction phases, often leveraging building information modelling approaches (Sanfilippo et al., 2025; Wan et al., 2025), to the maintenance and rehabilitation of existing infrastructure (Duffy et al., 2024).

A key component contributing to the LCA of geotechnical structures is the development of digital twins, which are virtual representations of physical assets maintained across their operational lifespan (Jiang et al., 2021). Developing accurate models that reflect real-world structures remains a significant challenge in geotechnical engineering, largely due to the complex and non-linear behaviour of soils (Atkinson, 2000). While real-time back-analyses using numerical simulations (Jaquès et al., 2024; Tian et al., 2024) and probabilistic methods (Contreras and Brown, 2019) can serve as digital twins, they are often computationally intensive and time-consuming, an important limitation in contexts where rapid decision-making is required to ensure structural safety. Consequently, surrogate modelling has emerged as a promising and efficient alternative for digital twin development in geotechnical applications.

Surrogate modelling is built upon two fundamental principles – the development of computationally efficient models and the preservation of sufficient predictive accuracy (Forrester et al., 2008). The aim is to identify simplified yet representative mappings of complex systems by learning input–output relationships (Queipo et al., 2005). A range of surrogate modelling techniques has been applied in geotechnical engineering. The response surface method has been used for the reliability assessment of reinforced earth structures (Ghazavi and Valinezhad-Torghabeh, 2024; Sarkar and Hegde, 2025; Yu and Bathurst, 2017). Kriging, based on Gaussian process regression, has supported offshore risk evaluations (Vazirizade and Haldar, 2021). Radial basis functions have been used in tunnel modelling (Khaledi et al., 2014) and polynomial chaos expansion has proven useful in assessing the reliability of geosynthetic-reinforced retaining walls under seismic loading (Alhajj Chehade et al., 2023).

Among these, surrogate models based on artificial neural networks (NNs) have demonstrated strong predictive power and versatility, which has driven their growing adoption in civil engineering, despite challenges such as the need for large datasets and the inherent complexity of model interpretability. Their applications include tunnel stability in karst formations (Kovačević et al., 2021), prediction of soil strength parameters (Heshmati et al., 2012), liquefaction analysis around submarine pipelines (Kutanaei and Choobbasti, 2019), liquefaction-induced lateral spreading (Demir and Sahin, 2023), the uplift capacity of strip anchors (Duong et al., 2025), tunnel heading stability (Keawsawasvong et al., 2025), long-term subgrade performance (Deng et al., 2025), settlement prediction above subsurface cavities (Shubham et al., 2024), prediction of deep excavation responses (Tao et al., 2022), embankment consolidation (Tian et al., 2024), mapping displacement of soil nail walls (Liu et al., 2021) and applying long short-term memory (LSTM) networks in slope displacement prediction (Tang et al., 2021). Tao et al. (2024) employed NNs for time-dependent surrogate modelling of braced excavations, utilising the concept of transfer learning. Ruiz López et al. (2024) developed a surrogate model of vertical shafts in London clay. Similarly, Ferrero et al. (2023) investigated surrogate modelling of braced excavations for real-time back-analysis. El-Sekelly et al. (2025) applied explainable artificial intelligence for predicting the strength of bio-cemented sands utilising various machine learning models.

This paper presents a novel hybrid NN-based surrogate modelling framework that integrates static input parameters (e.g. embedment depth, wall geometry and material properties) with sequential data representing a structure’s horizontal displacements. The model architecture leverages a combination of bidirectional long short-term memory (BiLSTM) layers and fully connected dense layers to capture complex dependencies in the input space. The BiLSTM layers can process entire sequences of ordered data simultaneously, making them well suited for spatially dependent sequences such as horizontal displacements along a retaining wall. Unlike feedforward neural networks (FNNs) with dense neurons, BiLSTMs can capture dependencies between positions in the sequence, analogous to temporal dependencies in time series, which allows the model to exploit spatial correlations in both short and long sequences. This approach allows the accurate prediction of maximum bending moments in quay walls, facilitating real-time structural assessment. The approach aligns with the UN Sustainable Development Goals (SDGs) (UN, 2015) (SDG 9: Industry, innovation and infrastructure, SDG 11: Sustainable cities and communities and SDG 12: Responsible consumption and production) by providing a scalable and computationally efficient tool that enhances infrastructure resilience, supports digitalisation in geotechnical engineering and promotes life-cycle extension through data-driven monitoring.

The aim of this work was to evaluate the capabilities and limitations of hybrid NN-based surrogate models using BiLSTMs in accurately predicting the maximum bending moment of a quay wall cross-section. The models use a combination of static input features and sequential horizontal displacement data to reduce the time and cost associated with traditional numerically based back-analysis, which requires extensive model calibration (e.g. soil parameters) to achieve an accurate system response consistent with field data. The proposed approach bypasses these intermediate calibration steps by directly linking measurable quantities to the desired analytical output. The performance of the hybrid models, integrating both dense and recurrent NNs, was compared against a baseline FNN model that relies solely on static inputs to demonstrate the advantages of incorporating sequential data into surrogate modelling.

The theoretical foundations and methodological framework are outlined in Sections 2 and 3, respectively. The results derived from the conducted analyses are discussed in Section 4. Section 5 concludes the paper by summarising key findings, acknowledging limitations and proposing directions for future research and practical implementation.

2. Theory

This section presents the theoretical background, including a description of the quay walls for which surrogate models are trained, the data sampling procedure required for the supervised learning process and the types of NNs.

2.1 Quay wall and structural system

Quay walls are essential infrastructural components in maritime settings, designed to retain soil and withstand forces resulting from earth pressure, hydrostatic loads and operational activities such as mooring and berthing impacts. These structures are generally categorised into gravity walls, cantilevered sheet pile walls and anchored systems, with the selection of a specific type largely governed by local ground conditions, design requirements and economic factors.

Anchored sheet pile quay walls are particularly prevalent due to their structural efficiency and cost-effectiveness. By employing tie rods or ground anchors, these systems are capable of significantly reducing bending moments and horizontal displacements, allowing the wall to sustain higher loads with a slender structural profile (HTG and DGGT, 2024).

For the purpose of surrogate model development, a simplified representation of an anchored quay wall was employed in this study, as shown in Figure 1. The structure was modelled as a horizontally supported beam with Dirichlet boundary conditions applied at the anchor point and the toe of the wall. The depth of the lower support was adjusted incrementally, increasing the embedment depth (t) until the shear force at this point approached zero, indicating a state of equilibrium between external loads and passive earth resistance. Lateral soil pressures were computed using Blum’s method for anchored retaining walls (Blum, 1950) and hydrostatic forces were applied to both faces of the wall to reflect variable water levels. A uniform surface surcharge load of 30.0 kN/m² was also introduced to account for operational loading. The soil parameters of the coarse-grained, cohesionless material used to calculate horizontal earth pressure were kept constant for all combinations, comprising an internal friction angle of 30°, a saturated unit weight of 18.0 kN/m³ and an unsaturated unit weight of 10.0 kN/m³. This idealised structural setup enabled consistent and reproducible simulation of internal forces and wall displacements, forming the basis for a comprehensive dataset used in the training and validation of the surrogate models. The computational analysis of the beam was performed using the anaStruct library for Python (Vink, 2016), with internal forces and horizontal displacements calculated and stored at discrete points along the sheet pile profile.

Figure 1. Simplified quay wall cross-section used for data sampling

The schematic diagram shows a vertical anchored sheet pile wall retaining soil, with water on the excavated side. Labels include the discretised horizontal displacements u_x,y, wall height H, water level on the excavated side W_a, embedment depth t, anchor level z_a, bending stiffness of the sheet pile profile EI and the surcharge load. The wall extends from the ground surface to the embedment depth t below the excavation level. A horizontal water surface is shown at W_a on the excavated side. A surcharge load is applied at the retained ground surface. An anchor is located at level z_a. The discretised horizontal displacements are indicated along the wall length.

The aim of this study was to demonstrate the application of hybrid NN-based surrogate modelling in geotechnical engineering. To this end, a simplified analytical model was selected to illustrate the approach. While soil non-linearity and measurement noise were not present in the dataset, the simplified model provided a controlled environment to test the capabilities of the hybrid architecture. Importantly, even though the chosen model was computationally light, the study focuses on validating the surrogate methodology itself, which can later be applied to more complex and computationally intensive models where real-time analysis or optimisation would make surrogate modelling essential.

2.2 Data sampling

Developing reliable surrogate models requires a carefully chosen data sampling approach to ensure accurate mapping between input variables and their corresponding outputs (Blondet et al., 2018). Given the relatively low computational expense associated with data generation in this study, a structured grid-based sampling technique was selected. This method discretised the input domain into fixed intervals, creating a uniform and comprehensive distribution of samples across the entire parameter space. The static input features were defined as

embedment depth of quay wall (t): result of iterative calculation (Section 2.1)
anchorage level of quay wall (z_a): from 0.5 m to 2.0 m in 0.1 m steps
water level on the excavated side (W_a): from 2.5 m to 4.5 m in 0.1 m steps
quay wall’s total height (H): from 5.0 m to 10.0 m in 0.5 m steps
sheet pile Z-profile: selected from {AZ 12–700, AZ 14–700, AZ 18–700, AZ 20–700, AZ 24–700} (ArcelorMittal, 2022).

To incorporate the sheet pile profile into the regression surrogate model, each profile was numerically encoded using its bending stiffness (EI), calculated as the product of the Young’s modulus of steel (E = 210 GPa) and the corresponding moment of inertia (I) of the profile.

The sampling scheme generated a dataset comprising N = 18 480 individual entries, each representing a unique combination of input variables ($x_{k, i}$), where k = 1, …, 5 corresponds to the parameters t, z_a, W_a, H and EI, and i = 1, …, N indexes the data sample. Recorded horizontal displacements (u_x,y) along the sheet pile were treated as sequential input arrays ($X_{1, i}$) for the BiLSTM components of the network, where each $X_{1, i}$ consisted of sequential input variables ($x_{y, i}^{R}$) representing individual displacement points along the sheet pile. The output feature (i.e. the maximum bending moment M_max) was defined as $y_{1, i}$ and the model’s accuracy was evaluated by comparing the values of $y_{1, i}$ to the predicted outputs (${\hat{y}}_{1, i}$) stored in the predicted subset $Ω_{predict}$. The full dataset, comprising N data samples, was subsequently divided during preprocessing into training, validation and test subsets ($Ω_{train}$, $Ω_{val}$ and $Ω_{test}$, respectively) to support model development and performance assessment. Data sampling was performed using known dependencies and leveraging Blum’s method (Blum, 1950). Therefore, an extensive analysis of multi-collinearity and feature importance (Raja et al., 2023) was omitted in this study. However, for more complex datasets including soil non-linearity and field measurements, it is strongly recommended to thoroughly examine the composition and dependencies within the data.
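
The size of the sampling grid can be checked with a short enumeration. The sketch below is illustrative only: the helper `frange` and the variable names are assumptions, the profiles are represented by their names rather than by EI values, and the embedment depth t is left out because it results from the iterative calculation rather than from the grid.

```python
from itertools import product


def frange(start, stop, step):
    """Inclusive list of grid values, built from an integer count to avoid float drift."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 10) for i in range(n)]


anchor_levels = frange(0.5, 2.0, 0.1)    # z_a in m
water_levels = frange(2.5, 4.5, 0.1)     # W_a in m
wall_heights = frange(5.0, 10.0, 0.5)    # H in m
profiles = ["AZ 12-700", "AZ 14-700", "AZ 18-700", "AZ 20-700", "AZ 24-700"]

# Every combination of the gridded inputs corresponds to one data sample
grid = list(product(anchor_levels, water_levels, wall_heights, profiles))
print(len(anchor_levels), len(water_levels), len(wall_heights), len(profiles))
print(len(grid))
```

The grid has 16 anchorage levels, 21 water levels, 11 wall heights and 5 profiles, which gives 16 × 21 × 11 × 5 = 18 480 combinations, consistent with the dataset size stated above.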

2.3 FNNs

FNNs represent one of the most basic yet foundational architectures within the broader domain of artificial NNs. The conceptual basis for artificial neurons dates back to the work of McCulloch and Pitts (1943), while the first trainable network (the perceptron) was introduced by Rosenblatt (1958). Although this single-layer design demonstrated early promise, it was later shown to be incapable of modelling non-linear relationships (Minsky and Papert, 1988). This limitation was overcome with the introduction of the back-propagation algorithm (Rumelhart et al., 1986), which enabled the training of multi-layer networks, thereby significantly expanding their functional capabilities.

FNNs operate by transmitting information in a single direction, starting from the input layer, passing through one or more hidden layers and finally reaching the output layer. Unlike recurrent architectures, FNNs do not incorporate cycles or feedback, which simplifies their behaviour and training. Each neuron in the network applies a weighted sum to its inputs, transforms the result using an activation function and passes the output forward to the next layer. This architecture is particularly well suited to tasks involving supervised learning, such as classification and regression, where a known target value is associated with each input.

A standard FNN is comprised of three distinct types of layers. The input layer consists of nodes corresponding to the features of the dataset and serves solely as a conduit for data into the network. The hidden layers are responsible for non-linear transformation and abstraction of the input space. Here, each neuron performs a linear combination of inputs followed by a non-linear activation, enabling the network to capture intricate dependencies within the data. Finally, the output layer generates the network’s prediction. In regression tasks, this layer typically contains a single neuron with a linear activation function, allowing it to output a continuous real-valued prediction (Hecht-Nielsen, 1993).
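
The layer structure described above can be illustrated with a minimal forward pass in plain Python. This is a sketch under stated assumptions, not the implementation used in the study: the weights, biases and input values are arbitrary placeholders, and only two hidden neurons are used to keep the example short.

```python
import math


def softplus(z):
    """Smooth non-linear activation, log(1 + exp(z))."""
    return math.log1p(math.exp(z))


def linear(z):
    """Identity activation, as typically used in a regression output neuron."""
    return z


def dense(inputs, weights, biases, activation):
    """One dense layer: each neuron applies activation(weighted sum of inputs + bias)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Placeholder static features in the order (t, z_a, W_a, H, EI); values are illustrative
x = [4.0, 1.0, 3.0, 8.0, 0.5]
hidden_w = [[0.1, -0.2, 0.05, 0.3, 0.0],
            [0.0, 0.1, 0.1, -0.1, 0.2]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0]]
out_b = [0.5]

hidden = dense(x, hidden_w, hidden_b, softplus)   # hidden layer, non-linear
y_hat = dense(hidden, out_w, out_b, linear)       # output layer, single linear neuron
print(hidden, y_hat)
```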

The training of FNNs is an iterative process comprising forward and backward passes. During forward propagation, the input vector is passed through the network to compute a prediction. The loss function then quantifies the deviation from the true value. In the back-propagation step, gradients of the loss with respect to each model parameter are calculated using the chain rule. These gradients are then used to update weights, typically by stochastic gradient descent or one of its variants.
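
The forward pass, loss evaluation and gradient-based update can be sketched for the simplest possible model, a single linear neuron trained with the MSE loss. The example uses plain full-batch gradient descent on made-up data; it is an illustration of the principle and not the Adam-based training used in this study.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the prediction y_hat = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, lr):
    """One gradient-descent update; gradients follow from the chain rule on the MSE."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


# Illustrative data generated by y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
losses = []
for _ in range(200):
    losses.append(mse(w, b, xs, ys))          # forward pass and loss
    w, b = gradient_step(w, b, xs, ys, 0.05)  # backward pass and update

print(round(w, 3), round(b, 3), losses[0], losses[-1])
```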

To ensure robust learning and fair evaluation, the dataset is divided into training and testing sets. The model is fitted using the training data, where it learns to associate input features with their corresponding outputs. The testing data, which remains unseen during training, is used to assess the model’s generalisation capability. This split is critical for identifying whether the network suffers from overfitting (where it learns noise rather than patterns) or underfitting (where the model lacks the complexity or training necessary to capture relevant trends in the data). FNNs were used in this work for surrogate modelling of quay walls using non-sequential static input features ($x_{k, i}$).

2.4 BiLSTM NN

LSTM NNs, introduced by Hochreiter and Schmidhuber (1997), are a specialised form of recurrent NNs designed to model sequential data and capture long-range dependencies effectively. Their architecture incorporates memory cells regulated by gating mechanisms (input, forget and output gates) that control information flow, enabling the network to retain relevant information over extended sequences. In an LSTM layer, an activation function controls signal propagation to the next layer, while a recurrent activation function is used within the gating mechanisms, regulating information flow across data steps. This structure addresses the vanishing and exploding gradient problems inherent in conventional recurrent NNs, facilitating stable training and improved learning of temporal correlations. A gradient is the partial derivative of the loss function with respect to a model parameter, which guides updates during training. During back-propagation through a sequence, the gradient is repeatedly scaled by values held in the model’s weight matrices. The two problems arise when the back-propagated gradients either grow exponentially large (exploding gradient) or diminish to near zero (vanishing gradient), making it difficult for the model to learn effectively (Staudemeyer and Morris, 2019).
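
For reference, the gating mechanism can be written in a commonly used form of the LSTM update equations. The notation below is an assumption introduced for illustration and does not appear in the source: $x_s$ is the input at sequence position $s$, $h_s$ the hidden state, $c_s$ the cell state, $\sigma_r$ the recurrent activation function, $\phi$ the activation function, and $W$, $U$ and $b$ the input weights, recurrent weights and biases.

```latex
\begin{aligned}
f_s &= \sigma_r\left(W_f x_s + U_f h_{s-1} + b_f\right) && \text{(forget gate)} \\
i_s &= \sigma_r\left(W_i x_s + U_i h_{s-1} + b_i\right) && \text{(input gate)} \\
o_s &= \sigma_r\left(W_o x_s + U_o h_{s-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_s &= \phi\left(W_c x_s + U_c h_{s-1} + b_c\right) && \text{(candidate cell state)} \\
c_s &= f_s \odot c_{s-1} + i_s \odot \tilde{c}_s && \text{(cell state update)} \\
h_s &= o_s \odot \phi\left(c_s\right) && \text{(hidden state)}
\end{aligned}
```

In this form, the additive cell state update is what allows information, and gradients, to pass along the sequence without being repeatedly multiplied by the recurrent weight matrices.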

BiLSTM NNs extend the LSTM framework by introducing a second, parallel LSTM layer that processes the input sequence in reverse order (Schuster and Paliwal, 1997). This bidirectional architecture allows the model to access both preceding and following contexts at each step of the sequence, enhancing its capacity to capture dependencies when predictions depend on the entire sequence. The outputs from forward and backward passes are usually concatenated before subsequent processing. Although BiLSTMs increase computational complexity and parameter count, they have demonstrated superior performance in applications such as time-series forecasting, sequence labelling, structural behaviour prediction under temporally varying inputs (Graves and Schmidhuber, 2005) and geotechnical predictions of laboratory test results (Cerek et al., 2024, 2025a) or LSTM-based parameter calibration (Cerek et al., 2025b). This study leveraged BiLSTMs to process sequential input data ($X_{1, i}$) comprising horizontal displacements along the quay wall (u_x,y) while merging their outputs with FNN-based components.

2.5 Activation functions

The selection of a suitable activation function for hidden layers plays a critical role in developing an accurate and efficient NN. Non-linear activation functions are employed within neurons to enable the mapping of complex relationships between input and output data. The final output of an NN is ultimately determined by the weighted sum of the activations produced by individual neurons.

In this study, a range of activation functions was investigated as part of the model architecture tuning process, with the aim of identifying optimal choices for both the FNN and the hybrid BiLSTM–FNN architecture. Notably, in the case of the recurrent BiLSTM model, certain activation functions were found to prevent convergence during training, indicating their unsuitability for this type of network. The activation functions evaluated were

rectified linear unit (ReLU) (Parhi and Nowak, 2020)
exponential linear unit (ELU) (Clevert et al., 2015)
sigmoid (Dubey et al., 2022)
hyperbolic tangent (tanh)
softplus (Dugas et al., 2000).
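
For orientation, the five activation functions listed above can be written in a few lines of Python using their standard definitions. The ELU scale parameter `alpha = 1.0` is an assumed default, and the naive formulas are not hardened against overflow for very large arguments.

```python
import math


def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def elu(z, alpha=1.0):
    """Exponential linear unit: identity for z > 0, alpha * (exp(z) - 1) otherwise."""
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


def sigmoid(z):
    """Logistic function with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent with outputs in (-1, 1)."""
    return math.tanh(z)


def softplus(z):
    """Smooth approximation of ReLU: log(1 + exp(z))."""
    return math.log1p(math.exp(z))


for name, fn in [("relu", relu), ("elu", elu), ("sigmoid", sigmoid),
                 ("tanh", tanh), ("softplus", softplus)]:
    print(name, [round(fn(z), 4) for z in (-2.0, 0.0, 2.0)])
```

Sigmoid and tanh saturate for inputs of large magnitude, whereas ReLU, ELU and softplus are unbounded for positive inputs.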

3. Methodology

This section outlines the methodology adopted in this study, encompassing data preprocessing, architecture tuning, the design of final NN configurations and the evaluation metrics used. A synthetically generated dataset served as the basis for extracting input–output pairs required for the supervised training of surrogate models. Two NN architectures were considered: (a) an FNN comprising only dense layers and (b) a hybrid model integrating BiLSTM layers with dense layers (BiLSTM–FNN model).

The training process was iterative, using a randomly selected training subset ($Ω_{train}$) to optimise the model parameters, while a separate validation subset ($Ω_{val}$) was used to monitor performance after each epoch. Once trained, the models were used to predict output values (${\hat{y}}_{1, i}$) for a dedicated prediction subset ($Ω_{predict}$). These predicted output values were compared against ground-truth values ($y_{1, i}$) drawn from an isolated test set ($Ω_{test}$) not involved in the training process.

The primary objective of this study was to assess whether incorporating sequential input data by way of BiLSTM layers enhanced surrogate model performance compared with a purely feedforward architecture relying solely on static input features. To ensure a fair comparison, both models underwent automated architecture tuning, optimising hyperparameters such as the number and size of layers, activation functions, the learning rate of the optimiser and batch size. Final training was carried out until no further improvements in predictive accuracy were observed. The aim of the evaluation procedure was to quantify performance through a combination of qualitative and quantitative metrics.

3.1 Preprocessing

The dataset was imported from an external file into a structured data frame, with specific columns designated as

static input features ($x_{k, i}$): embedment depth (t), anchorage level (z_a), water level on the excavated side (W_a), total wall height (H) and bending stiffness of sheet pile profile (EI)
sequential input array ($X_{1, i}$): horizontal displacements along the sheet pile (u_x,y)
output variable ($y_{1, i}$): maximum bending moment (M_max).

The dataset was randomly divided into training, validation and testing subsets ($Ω_{train}$, $Ω_{val}$ and $Ω_{test}$) using a 60/20/20 split, with each subset containing a corresponding proportion of the total N data samples. To ensure reproducibility, a fixed random seed of 1993 was applied consistently across all major libraries. For the output variable $y_{1, i}$, the absolute value of the maximum bending moment |M_max| was used during training. As the sign of the maximum bending moment was consistent throughout the dataset, the use of absolute values did not affect the learning process.
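
A seeded 60/20/20 split can be sketched with the Python standard library as shown below. The function name `split_indices` is hypothetical and the source does not state which routine was used, so the resulting sample assignment will differ from the original study even with the same seed.

```python
import random


def split_indices(n_samples, seed=1993, fractions=(0.6, 0.2, 0.2)):
    """Shuffle sample indices reproducibly and cut them into train/val/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed gives a repeatable order
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]       # remaining samples
    return train, val, test


train, val, test = split_indices(18480)
print(len(train), len(val), len(test))
```

For 18 480 samples this gives subsets of 11 088, 3696 and 3696 samples, with no sample shared between subsets.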

Although data scaling is known to improve training efficiency (Sinsomboonthong, 2022), it was intentionally omitted in this study. Scaling would have introduced unnecessary complications when applying the model to out-of-range datasets, due to the need for extrapolating scaled values. This extrapolation can lead to inaccuracies and operational limitations, particularly when the magnitudes of horizontal displacements (u_x,y) differ significantly from those observed in the training data. Additionally, scaling must be based only on the training data, since normalising with the test data as well would leak information and inflate the apparent predictive accuracy (Kumar, 2025). The dense neurons in the surrogate models used ReLU-type activation functions (linear for positive values), so no saturation could occur. Excessive growth of the weighted sums due to large magnitudes of the input variables was mitigated through weight compensation during training, aided by the small learning rates of the optimiser.

3.2 Architecture

For the architecture tuning of hyperparameters in both the FNN and BiLSTM-FNN surrogate models, the KerasTuner library (O’Malley et al., 2019) was employed, utilising the Hyperband tuner (Li et al., 2018) owing to its efficiency in combining random search with early stopping criteria. Tables 1 and 2 summarise the hyperparameters adjusted during the tuning process, along with the corresponding search ranges.

Table 1.

Search ranges and hyperparameters explored during architecture tuning of FNN-based surrogate model

Hyperparameter	Search range/options	Description
Activation function	ReLU, tanh, ELU, softplus	Activation function for hidden dense layers
Number of dense layers	1–3	Number of dense hidden layers
Dense neurons per layer	16–256 (step 16)	Number of neurons in each dense layer
Learning rate	1 × 10⁻⁵ to 1 × 10⁻¹	Learning rate for Adam optimiser
Batch size	{16, 32, 64, 128}	Number of samples processed together in one training step
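
The search space in Table 1 can be expressed as a plain Python dictionary from which a single trial configuration is drawn at random. This is a sketch only: it does not reproduce the Hyperband scheduling of KerasTuner, the names `FNN_SEARCH_SPACE` and `sample_trial` are hypothetical, and both the log-uniform draw of the learning rate and the use of one activation function for all hidden layers are assumptions, since the table gives only ranges and options.

```python
import math
import random

FNN_SEARCH_SPACE = {
    "activation": ["relu", "tanh", "elu", "softplus"],
    "num_dense_layers": (1, 3),      # inclusive bounds
    "dense_units": (16, 256, 16),    # (minimum, maximum, step)
    "learning_rate": (1e-5, 1e-1),   # bounds for the Adam optimiser
    "batch_size": [16, 32, 64, 128],
}


def sample_trial(space, rng):
    """Draw one hyperparameter combination from the search space."""
    lo, hi, step = space["dense_units"]
    n_layers = rng.randint(*space["num_dense_layers"])
    lr_lo, lr_hi = space["learning_rate"]
    return {
        "activation": rng.choice(space["activation"]),
        "dense_units": [rng.randrange(lo, hi + 1, step) for _ in range(n_layers)],
        # Log-uniform sampling is an assumption; the source lists only the bounds
        "learning_rate": 10 ** rng.uniform(math.log10(lr_lo), math.log10(lr_hi)),
        "batch_size": rng.choice(space["batch_size"]),
    }


trial = sample_trial(FNN_SEARCH_SPACE, random.Random(0))
print(trial)
```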

Table 2.

Search ranges and hyperparameters explored during architecture tuning of hybrid BiLSTM-FNN-based surrogate model

Hyperparameter	Search range/options	Description
Dense activation function	ReLU, tanh, ELU, softplus	Activation function for hidden dense layers
Number of dense layers	1–3	Number of dense hidden layers
Dense neurons per layer	16–256 (step 16)	Number of neurons in each dense layer
Activation function for static branch	ReLU, tanh, ELU, softplus	Activation function for static input in the layer merging recurrent and static branches
BiLSTM activation function	tanh, ReLU, ELU, sigmoid, softplus	Activation function for shaping unit output
BiLSTM recurrent activation function	tanh, ELU, sigmoid, softplus	Activation function used in gating mechanism
Number of BiLSTM layers	1–3	Number of BiLSTM layers
BiLSTM units per layer	16–256 (step 16)	Number of units in each BiLSTM layer
Learning rate	1 × 10⁻⁵ to 1 × 10⁻¹	Learning rate for Adam optimiser
Batch size	{16, 32, 64, 128}	Number of samples processed together in one training step

An early stopping strategy was implemented to monitor the validation loss, quantified by the mean squared error (MSE) (Chicco et al., 2021). Training was halted if no improvement greater than 1.0 was observed over ten consecutive epochs. The maximum number of training epochs during tuning was set to 50.
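
The stopping rule can be sketched in plain Python as follows. The function `early_stopping_epoch` and the loss history are hypothetical; the logic mirrors the rule as described (minimum improvement of 1.0, patience of ten epochs, at most 50 epochs) and is not a copy of the callback used in the study.

```python
def early_stopping_epoch(val_losses, min_delta=1.0, patience=10):
    """Return the 1-based epoch at which training stops, or None if it runs to the end."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # improvement larger than min_delta
            best = loss
            wait = 0
        else:                         # no sufficient improvement in this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation MSE: improves for five epochs, then stalls (50 epochs in total)
history = [5000.0, 4200.0, 3900.0, 3700.0, 3600.0] + [3599.5] * 45
print(early_stopping_epoch(history))

# A history that keeps improving by more than min_delta is never stopped early
steady = [5000.0 - 2.0 * k for k in range(50)]
print(early_stopping_epoch(steady))
```

In the first history, the drop from 3600.0 to 3599.5 is smaller than the threshold of 1.0, so the patience counter starts at epoch 6 and training stops at epoch 15.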

Figure 2 shows the evolution of the validation loss across all trials conducted for the FNN-based surrogate model. The tuning process comprised 90 trials, with the final trial yielding the best performance with a validation loss of 3567.7. The optimised architecture consists of a single dense layer with 32 neurons, employing the softplus activation function, a learning rate of 0.000593 and a batch size of 128. This configuration results in only 225 trainable parameters, classifying the network as a relatively small and computationally efficient model.
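
The stated parameter count can be reproduced by hand. The sketch below assumes five static inputs, one hidden dense layer with 32 neurons and a single output neuron, each neuron having one weight per input plus one bias; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a dense layer: one weight per input and one bias per neuron."""
    return n_inputs * n_units + n_units


n_static = 5                              # t, z_a, W_a, H, EI
hidden = dense_params(n_static, 32)       # 5 * 32 + 32 = 192
output = dense_params(32, 1)              # 32 * 1 + 1 = 33
total = hidden + output
print(hidden, output, total)              # 192 33 225
```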

Figure 2. Validation loss function values across trials for various combinations of hyperparameters in the preliminary study of the architecture for FNN-based surrogate model

The graph plots the validation loss function against the trial number. The vertical axis shows the validation loss function from 0 to 60000 and the horizontal axis shows the trial from 0 to 100. The legend includes the validation loss function, the best trial, the minimum loss and the total number of trials (90). The curve fluctuates between approximately 5000 and 40000 across trials. The best trial is marked at trial 89 with a validation loss of 3567.7. The annotation lists a learning rate of 0.000593, a batch size of 128 and dense layer 1 with 32 neurons and softplus activation.

Figure 3 shows the results of the tuning process for the hybrid surrogate model combining BiLSTM and dense layers. A total of 64 trials were conducted to determine the optimal network architecture. However, 22 of these trials produced NaN (not-a-number) values, indicating a failure to converge, and were consequently excluded from the evaluation. The non-convergence was primarily attributed to the use of unsuitable activation functions, which can result in issues such as exploding or vanishing gradients during training. Trial 54 achieved the lowest validation loss, with a value of 314.8, which was approximately 11 times lower than that of the optimised FNN model. The resulting architecture comprised three BiLSTM layers with 88, 72 and 88 units, respectively. Each BiLSTM layer used the sigmoid activation function for signal propagation and tanh as the recurrent activation for updating the cell state (gating mechanism). The static input branch employed the softplus activation function across its 64 neurons. The outputs from the BiLSTM layers and the first dense layer were concatenated and passed to two subsequent dense layers with 168 and 216 neurons, using ELU and ReLU activation functions, respectively. The optimal model configuration included a learning rate of 0.000447 and a batch size of 64. This model required 448 409 trainable parameters, significantly more than the 225 of the FNN model.
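
The parameter count of the hybrid model can be cross-checked with a short calculation. The sketch rests on several assumptions that are not stated in the source: standard LSTM cells with four weight sets each, one displacement value per sequence position, concatenation of forward and backward outputs, a final BiLSTM layer that passes only its last state to the merge, and a single linear output neuron. Under these assumptions the stated total is reproduced.

```python
def dense_params(n_in, n_units):
    """Weights plus biases of a dense layer."""
    return n_in * n_units + n_units


def lstm_params(n_in, n_units):
    """Four weight sets (three gates and the candidate state), each with
    input weights, recurrent weights and a bias vector."""
    return 4 * (n_units * (n_in + n_units) + n_units)


def bilstm_params(n_in, n_units):
    """Forward and backward LSTM with separate parameters."""
    return 2 * lstm_params(n_in, n_units)


lstm_units = [88, 72, 88]
n_in = 1                        # assumption: one displacement value per position
recurrent_total = 0
for units in lstm_units:
    recurrent_total += bilstm_params(n_in, units)
    n_in = 2 * units            # concatenated forward and backward outputs

static_branch = dense_params(5, 64)        # five static inputs, 64 neurons
merged = 2 * lstm_units[-1] + 64           # 176 recurrent + 64 static = 240
dense_total = (dense_params(merged, 168)
               + dense_params(168, 216)
               + dense_params(216, 1))     # single output neuron

total = recurrent_total + static_branch + dense_total
print(recurrent_total, static_branch, dense_total, total)
```

The recurrent layers account for 370 816 of the 448 409 parameters, which indicates that most of the additional model size comes from the BiLSTM branch.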

Figure 3.

A graph plots validation loss function versus trial for 64 trials with hyperparameter details.


The graph plots validation loss function versus trial. The vertical axis shows Validation loss function from 0 to 90000 and the horizontal axis shows Trial from 0 to 70. The legend includes Validation loss function, Best trial, Minimum loss, Total number of trials 64, and Number of dropped trials (NaN) 22. The curve fluctuates between approximately 1000 and 40000 across trials. The best trial is marked at Trial 54 with Validation loss 314.8. The annotation lists Learning rate 0.000447, Batch size 64, LSTM activation sigmoid, LSTM recurrent activation tanh, LSTM layer 1 88 neurons, LSTM layer 2 72 neurons, LSTM layer 3 88 neurons, Activation static branch softplus, Dense layer 1 168 neurons ELU, and Dense layer 2 216 neurons ReLU.

Validation loss function values across trials for various combinations of hyperparameters in the preliminary study of the architecture for BiLSTM-FNN-based surrogate model
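The screening of tuning trials described for Figure 3 (discard non-converged trials, then take the lowest validation loss) can be sketched as follows. The function name and the trial values are hypothetical; only the selection logic reflects the text.

```python
import math


def best_trial(trials):
    """Return ((trial_id, loss), n_dropped) for the lowest finite validation loss.

    Non-finite losses are removed first because comparisons with NaN are
    always False, so min() on unfiltered data can return a wrong entry.
    """
    valid = [(tid, loss) for tid, loss in trials if math.isfinite(loss)]
    n_dropped = len(trials) - len(valid)
    return min(valid, key=lambda item: item[1]), n_dropped


# Hypothetical trial results (trial id, validation loss)
trials = [(0, 21500.0), (1, float("nan")), (2, 980.4),
          (3, float("nan")), (4, 314.8)]

best, n_dropped = best_trial(trials)
print(best, n_dropped)
```

Filtering before ranking keeps the count of dropped trials available for reporting, as in the legend of Figure 3.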

The final architectures shown in Figures 4 and 5 were used to train the final surrogate models based on the FNN and the BiLSTM–FNN, respectively. For model compilation, the Adam optimiser (Kingma and Ba, 2014) was employed. No regularisation techniques were incorporated in the model architectures, as no signs of overfitting were observed during training or validation (e.g. performance divergence between the training and validation sets). The MSE was used as the evaluation metric for both the training and validation subsets.

Figure 4.

A diagram illustrates a neural network architecture with input, hidden, and output layers, including static input features, neurons, and softplus activation function.


The neural network diagram shows Static input features, Input neuron, Dense neuron, Output feature, and Output dense neuron. The input layer includes the variables t, z_a, W_a, H and EI connected to input neurons labelled $x_{1, i}$, $x_{2, i}$, $x_{4, i}$ and $x_{5, i}$. The hidden layer uses softplus activation and contains neurons labelled h_1,1, h_1,2, h_1,31 and h_1,32, with additional intermediate nodes indicated by ellipsis. The output layer contains neuron ${\hat{y}}_{1, i}$ connected to the output feature M_max.

Architecture of FNN-based surrogate model
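As an illustration of the FNN in Figure 4, a forward pass through one hidden softplus layer can be written in a few lines. This is a sketch with random, untrained weights; the linear output neuron, the weight initialisation and the scaled input values are assumptions made for the example.

```python
import math
import random


def softplus(x: float) -> float:
    """Numerically stable softplus, log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def fnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Five static inputs -> softplus hidden layer -> one linear output."""
    hidden = [softplus(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


n_inputs, n_hidden = 5, 32  # t, z_a, W_a, H, EI -> 32 softplus neurons
rng = random.Random(0)      # untrained, illustrative weights
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
b_out = 0.0

# Hidden weights and biases plus output weights and bias
n_params = n_hidden * n_inputs + n_hidden + n_hidden + 1

x = [0.5, 0.2, 0.7, 0.4, 0.9]  # hypothetical scaled inputs
prediction = fnn_forward(x, w_hidden, b_hidden, w_out, b_out)
print(n_params, prediction)
```

With these layer sizes the network has only 225 trainable parameters, which illustrates how small the static-feature model is compared with the hybrid architecture.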

Figure 5.

A schematic of a hybrid BiLSTM–FNN network with static and sequential inputs and output M_max.


The schematic shows a hybrid neural network architecture divided into input layers, hidden layers and an output layer. The upper input branch receives the horizontal displacements u_x,1 to u_x,y as sequential input neurons labelled $X_{1, i}$, with intermediate nodes $x_{1, i}^{R}$, $x_{2, i}^{R}$, up to $x_{y, i}^{R}$ enclosed in a dashed boundary. These sequential inputs pass through three stacked BiLSTM layers with sigmoid/tanh activations, containing nodes h_1,1 to h_1,88, h_2,1 to h_2,72 and h_3,1 to h_3,88. The lower input branch receives the static input features t, z_a, W_a, H and EI into fully connected neurons $x_{1, i}$ to $x_{5, i}$, followed by a dense layer with softplus activation and nodes h_1,1 to h_1,64. Outputs from the BiLSTM branch and the static dense branch merge at a concatenation node (BiLSTM + FNN) and feed into two fully connected layers labelled ELU and ReLU, with nodes h_4,1 to h_4,168 and h_5,1 to h_5,216. The final output neuron ${\hat{y}}_{1, i}$ connects to the output feature M_max. A legend identifies Static input features, Horizontal displacements, Dense neuron, BiLSTM unit, Output feature, Input neuron, Sequential input neurons, and Output dense neuron.

Architecture of BiLSTM-FNN-based surrogate model

The supervised learning process was limited to a maximum of 10 000 training epochs, with an early stopping criterion implemented to mitigate overfitting. Training was terminated if the validation loss failed to improve by more than 1.0 over 50 consecutive epochs, in accordance with the recommendations of Prechelt (2002).
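The stopping rule described above can be expressed as a short function operating on a recorded validation-loss history. This is a minimal sketch of the rule as worded in the text, not the training code used in the study; the function name is hypothetical. Keras offers comparable behaviour through the `min_delta`, `patience` and `restore_best_weights` arguments of its `EarlyStopping` callback, although the text does not state which mechanism was used.

```python
def early_stopping_epoch(val_losses, min_delta=1.0, patience=50):
    """Return (stop_epoch, best_epoch, best_loss) for a validation-loss history.

    An epoch counts as an improvement only if it lowers the best loss by
    more than min_delta; training stops after `patience` consecutive
    epochs without such an improvement.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if best_loss - loss > min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch, best_loss
    # History ended before the patience budget was exhausted
    return len(val_losses) - 1, best_epoch, best_loss


# Hypothetical history with a short patience so that the rule triggers
history = [100.0, 50.0, 49.5, 49.2, 49.0, 10.0]
stop_epoch, best_epoch, best_loss = early_stopping_epoch(history, patience=3)
print(stop_epoch, best_epoch, best_loss)
```

In this example training stops at epoch 4 and the weights of epoch 1 would be restored; the later drop to 10.0 is never reached, which is the usual price of a finite patience window.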

The NNs were implemented and executed using Python (Van Rossum and Drake, 2009), utilising the TensorFlow framework (Abadi et al., 2015) and the Keras API (Chollet, 2015). All computational tasks were performed on a workstation equipped with an AMD Ryzen Threadripper PRO 7975WX 32-core CPU at 4.00 GHz, 512 GB RAM and SSD storage. Although the system included an NVIDIA RTX A6000 GPU with 48 GB VRAM, GPU acceleration was not used for training in this study, since the training durations remained within acceptable computational limits.

3.3 Postprocessing

Throughout all training epochs, the MSE for both training and validation subsets was continuously monitored and recorded. The model’s predicted outputs were saved in a structured data frame, alongside their corresponding actual values from the test subset. Evaluation of the performance of the surrogate models was conducted using both qualitative and quantitative metrics, including

the progression of training and validation loss curves over the epochs
a scatter plot comparing the predicted and actual values of the maximum bending moment, accompanied by the coefficient of determination (R²).

The coefficient of determination is a fundamental scale-independent metric used to assess the overall accuracy of regression models, including NN-based surrogate models. As defined in Equation 1, R² quantifies the proportion of variance in the dependent variable that is explained by the model. In Equation 1, $y_{1, i}$ denotes the actual value, ${\hat{y}}_{1, i}$ is the predicted value, $\bar{y}$ is the mean of the actual values and n is the number of samples evaluated. R² has an upper bound of 1, which indicates a perfect prediction; a value of 0 implies that the model performs no better than simply predicting the mean, and negative values arise when the model performs worse than that baseline. Within the context of NNs, R² is widely adopted to evaluate how well a model captures the underlying data patterns (Chicco et al., 2021). Additionally, the mean absolute error (MAE) over the test subset $Ω_{test}$ was calculated to provide an engineering-relevant measure of the model's predictive capability, expressed in the units of the bending moment.

$$R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{1, i} - {\hat{y}}_{1, i})}^{2}}{\sum_{i = 1}^{n} {(y_{1, i} - \bar{y})}^{2}} \qquad (1)$$
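Equation 1 and the MAE can be evaluated directly from paired lists of actual and predicted values. The following sketch uses only the Python standard library; the function names and the sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination as defined in Equation 1."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute deviation, in the units of the target."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


# Hypothetical maximum bending moments in kNm/m
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]

r2 = r_squared(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(r2, mae)

# A prediction that is worse than the mean gives a negative value
r2_poor = r_squared([100.0, 200.0, 300.0], [300.0, 200.0, 100.0])
print(r2_poor)
```

The second call illustrates that the ratio in Equation 1 is not bounded below: reversing the order of the predictions gives a residual sum of squares four times the total sum of squares.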

4. Results

The results of this study cover the training of the FNN-based surrogate model and of the hybrid model combining BiLSTM layers with dense layers. These results demonstrate the superior predictive accuracy of the hybrid architecture. Figure 6 shows the training and validation loss curves recorded during the training of the FNN model, which reached a minimum validation loss of 1130.4 at epoch 867. Figure 7 shows the corresponding loss curves for the BiLSTM–FNN model. Although the training process appears somewhat unstable, with visible fluctuations, the best-performing model achieved a minimum validation loss of 14.2 at epoch 198. The training trajectory of the FNN model appears more stable than that of the hybrid model and is characterised by a smooth and gradual decline in both training and validation loss curves. This contrast reflects the increased complexity and dynamic learning behaviour associated with the recurrent layers in the hybrid architecture. Regarding computational cost, the BiLSTM–FNN model required 2606.41 s to complete training, whereas the FNN model finished in 153.75 s. Training the purely FNN architecture was therefore about 17 times faster, which may be explained by its much lower number of trainable parameters.

Figure 6.

A graph plots loss function versus epoch for training and validation with restored best weight.


The graph plots Loss function on the vertical axis from 0 to 20000 and Epoch on the horizontal axis from 0 to 1000, with Training loss and Validation loss decreasing from about 14000 at epoch 0 to about 1200 near epoch 900, and a marked point at Epoch 867 with Validation loss 1130.4 labelled Restored best weight.

Loss function curve and restored best weight for the training and validation dataset across epochs for FNN-based surrogate model

Figure 7.

A graph plots loss function versus epoch with marked best validation result at epoch 198.


The graph plots Loss function on the vertical axis from 0 to 3000 and Epoch on the horizontal axis from 0 to 250, with Training loss and Validation loss fluctuating at higher values before epoch 100 and decreasing towards values below 200 after epoch 150, and a marked point at Epoch 198 with Validation loss 14.2 labelled Restored best weight.

Loss function curve and the restored best weight for the training and validation dataset across epochs for BiLSTM-FNN-based surrogate model

The FNN model achieved reasonable accuracy, with R² = 0.885, as illustrated in Figure 8, which presents a scatter plot comparing the predicted values of the maximum bending moment (${\hat{y}}_{1, i}$) with the actual values ($y_{1, i}$) from the test subset $Ω_{test}$. However, the broad scatter of predicted values around the perfect-fit line highlights the limited predictive accuracy of the FNN surrogate model. The MAE of the FNN model was 26.8 kNm/m. The enhanced performance of the BiLSTM–FNN surrogate model is demonstrated in Figure 9. A near-perfect alignment of the data points along the ideal prediction line indicates excellent predictive capability. The model achieved R² = 0.999 and MAE = 2.7 kNm/m across 3696 unseen test samples, confirming its high accuracy and strong generalisation performance. Based on the MAE, the hybrid model was about ten times more accurate than the pure FNN model. Table 3 provides additional error statistics for both models.
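The comparative factors quoted in the text follow from simple ratios of the reported values. The short check below uses only figures stated in this paper; the dictionary layout is an illustrative choice.

```python
# Values reported for the two surrogate models
fnn = {"tuning_loss": 3567.7, "val_loss": 1130.4,
       "train_time_s": 153.75, "mae": 26.8}
hybrid = {"tuning_loss": 314.8, "val_loss": 14.2,
          "train_time_s": 2606.41, "mae": 2.7}

tuning_ratio = fnn["tuning_loss"] / hybrid["tuning_loss"]   # best tuning trials
val_ratio = fnn["val_loss"] / hybrid["val_loss"]            # final training runs
time_ratio = hybrid["train_time_s"] / fnn["train_time_s"]   # training cost
mae_ratio = fnn["mae"] / hybrid["mae"]                      # test accuracy

for name, value in [("tuning loss", tuning_ratio), ("validation loss", val_ratio),
                    ("training time", time_ratio), ("MAE", mae_ratio)]:
    print(f"{name}: factor {value:.1f}")
```

The ratios are about 11.3 for the tuning loss, 79.6 for the final validation loss, 17.0 for the training time and 9.9 for the MAE. The loss ratios are larger than the MAE ratio because the MSE penalises errors quadratically.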

Figure 8.

A scatter graph compares predicted and actual output values, with R² = 0.885.


The graph plots Predicted output ${\hat{y}}_{i}$ (kNm/m) on the vertical axis from 0 to 500 against Actual output $y_{i}$ (kNm/m) on the horizontal axis from 0 to 500. The legend includes Comparison of predicted and actual values, Perfect prediction ${\hat{y}}_{i} = y_{i}$, and Number of data points 3696. A dashed line represents perfect prediction. The data points form an increasing relationship as actual output increases from about 20 to 450, with predicted output increasing from about 0 to 430. The coefficient of determination is R² = 0.885.

Comparison of predicted (${\hat{y}}_{1, i}$) and actual ($y_{1, i}$) maximum bending moments and coefficient of determination (R²) for the test subset using the trained FNN-based surrogate model

Figure 9.

A scatter graph compares predicted and actual output values, with R² = 0.999.


The graph plots Predicted output ${\hat{y}}_{i}$ (kNm/m) on the vertical axis from 0 to 500 against Actual output $y_{i}$ (kNm/m) on the horizontal axis from 0 to 500. The legend includes Comparison of predicted and actual values, Perfect prediction ${\hat{y}}_{i} = y_{i}$, and Number of data points 3696. A dashed line represents perfect prediction. The data points align closely along the increasing diagonal as actual output increases from about 30 to 450, with predicted output increasing from about 30 to 440. The coefficient of determination is R² = 0.999.

Comparison of predicted (${\hat{y}}_{1, i}$) and actual ($y_{1, i}$) maximum bending moments and coefficient of determination (R²) for the test subset using the trained BiLSTM-FNN-based surrogate model

Table 3.

Error statistics of FNN and BiLSTM-FNN models

Surrogate model	Mean error: kNm/m	Error variance: (kNm/m)²	Minimum error: kNm/m	Maximum error: kNm/m	Skewness	RMSE: kNm/m
FNN	−0.28	1119.3	−74.2	125.8	0.67	33.5
BiLSTM-FNN	0.06	13.5	−16.7	22.8	0.89	3.7
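The entries of Table 3 can be cross-checked against each other, because the mean squared error equals the error variance plus the squared mean error (exactly for the population variance, approximately for the sample variance with 3696 points). The sketch below applies this identity to the tabulated values; the dictionary layout is an illustrative choice.

```python
import math

# Mean error and error variance taken from Table 3
table_3 = {
    "FNN": {"mean": -0.28, "variance": 1119.3, "rmse_reported": 33.5},
    "BiLSTM-FNN": {"mean": 0.06, "variance": 13.5, "rmse_reported": 3.7},
}

rmse_check = {}
for model, stats in table_3.items():
    # RMSE^2 = variance + (mean error)^2
    rmse_check[model] = math.sqrt(stats["variance"] + stats["mean"] ** 2)
    print(f"{model}: RMSE from variance and mean = {rmse_check[model]:.1f} kNm/m, "
          f"reported = {stats['rmse_reported']} kNm/m")
```

Both recomputed values round to the tabulated RMSE, and the variance must carry squared units for the identity to hold. The implied test MSE values of roughly 1119 and 13.5 are also close to the minimum validation losses of 1130.4 and 14.2 reported for the two models.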

The improved performance of the BiLSTM–FNN model is attributed to its ability to effectively capture dependencies within the horizontal displacement sequences, which the purely FNN architecture was unable to model. By processing ordered spatial data through BiLSTM layers, the hybrid architecture yielded a more accurate surrogate model for the maximum bending moment.

The high accuracy of the BiLSTM–FNN hybrid model was critically evaluated. To prevent test data leakage, the dataset was split prior to training, and testing was conducted using the final trained model; therefore, the results were not compromised by prior exposure to test data. The activation functions used within the NN were non-linear mappings from input features to output labels, and no physical equations directly linking inputs to outputs were implemented.
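The leakage safeguard described above amounts to fixing the subsets before any fitting takes place. A minimal sketch is given below; the split fractions, the random seed and the use of min–max scaling fitted on the training subset only are assumptions made for the example and are not taken from the study.

```python
import random


def split_indices(n_samples, test_fraction, val_fraction, seed=0):
    """Shuffle sample indices once and cut them into disjoint subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    n_val = round(n_samples * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def fit_min_max(train_values):
    """Return a scaler whose bounds come from the training subset only."""
    lower, upper = min(train_values), max(train_values)
    return lambda value: (value - lower) / (upper - lower)


# Hypothetical fractions and sample count
train, val, test = split_indices(100, test_fraction=0.2, val_fraction=0.1, seed=1)
scale = fit_min_max([2.0, 4.0, 6.0])  # hypothetical training values of one feature
print(len(train), len(val), len(test), scale(4.0), scale(8.0))
```

Fitting the scaler on the training subset alone means that test values may fall outside the interval from 0 to 1, as the last printed value shows, but no information from the test subset enters the preprocessing.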

5. Conclusion

This study demonstrates the added value of a hybrid NN-based surrogate model that integrates BiLSTM layers with dense layers to process both static input features and sequential data, such as horizontal displacements measured along a quay wall. The hybrid BiLSTM–FNN model was systematically compared with a conventional FNN model composed solely of dense layers. Through architecture tuning, the optimal configurations of both models were identified. The trained models successfully predicted the maximum bending moment in quay wall structures, with the hybrid model outperforming the FNN model. Specifically, the BiLSTM–FNN surrogate achieved a coefficient of determination (R²) of 0.999, compared with R² = 0.885 for the FNN model, clearly demonstrating its superior predictive accuracy. However, a trade-off between training time and accuracy should be considered when choosing a surrogate modelling strategy. The purely FNN model was about 17 times faster to train but approximately ten times less accurate (based on the MAE) than the hybrid BiLSTM–FNN model.

5.1 Limitations

While the proposed BiLSTM–FNN surrogate model demonstrated high predictive accuracy and computational efficiency, several limitations should be acknowledged.

Firstly, the model was developed and validated using a synthetic dataset generated under controlled boundary conditions. As such, its generalisation to other geotechnical configurations, soil types or real-world field monitoring data remains untested. Additional validation using diverse datasets, including field measurements, would be necessary to confirm the model’s robustness.

Secondly, no uncertainty quantification or sensitivity analysis was performed in this study. While the input variables were selected based on engineering judgement and physical relevance, the absence of formal feature importance analysis limits insights into how individual features influence the model’s output.

5.2 Outlook

Building on the promising results of this study, several directions for future research are proposed. Further validation of the BiLSTM–FNN surrogate modelling approach, including training on field monitoring data from instrumented quay walls, is crucial to assess its reliability under varying loading and soil conditions. The incorporation of uncertainty quantification is recommended to define confidence bounds for model predictions, thereby establishing a framework for using the proposed approach as a decision support tool through probabilistic assessment. Additionally, feature importance analysis could provide deeper insights into the relative contribution of input variables, offering guidance on potential model simplification or expansion without compromising predictive accuracy. The integration of physical relationships or constraints into the hybrid model may further enhance its robustness and reduce the amount of training data required, which is especially valuable for sites with limited monitoring records.

5.3 Potential applications

The results of this study underscore the importance of incorporating spatially dependent displacement data alongside static features in geotechnical modelling. The proposed surrogate modelling approach offers a computationally efficient and scalable solution for real-time structural assessment and supervision of quay walls, as well as other geotechnical structures such as embankments, deep excavations and pile foundations. By enabling rapid predictions without the need for repeated numerical simulations, the method supports life-cycle management and contributes to infrastructure sustainability through optimised monitoring and informed decision making. Future research should focus on the integration of such models into digital platforms and life-cycle management systems to enable cost-effective, automated and data-driven geotechnical asset management.

References

Abadi M, Agarwal A, Barham P et al. (2015) TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Cornell University, Ithaca, NY, USA, https://doi.org/10.5281/zenodo.4724125.

Alhajj Chehade H, Guo X, Dias D et al. (2023) Reliability analysis for internal seismic stability of geosynthetic-reinforced soil walls. Geosynthetics International 30(3): 296–314, https://doi.org/10.1680/jgein.22.00250.

ArcelorMittal (2022) Piling Handbook. ArcelorMittal, Luxembourg City, Luxembourg.

Atkinson JH (2000) Non-linear soil stiffness in routine design. Géotechnique 50(5): 487–508, https://doi.org/10.1680/geot.2000.50.5.487.

Blondet G, Le Duigou J, Boudaoud N and Eynard B (2018) An ontology for numerical design of experiments processes. Computers in Industry 94: 26–40, https://doi.org/10.1016/j.compind.2017.09.005.

Blum H (1950) Beitrag zur berechnung von bohlwerken. Bautechnik 27(2): 45–52 (in German).

Cerek K, Dao DA, Hadjiloo E and Grabe J (2024) Application of LSTM time series forecasting method for predicting compression curves of soil. In Proceedings of the 17th Pan American Conference on Soil Mechanics and Geotechnical Engineering & 2nd Latin American Regional Conference of the IAEG, La Serena, Chile.

Cerek K, Gupta A, Dao DA, Hadjiloo E and Grabe J (2025a) Predicting soil stress–strain behaviour with bidirectional long short-term memory networks. Machine Learning and Data Science in Geotechnics 1(1): 60–77, https://doi.org/10.1108/MLAG-08-2024-0007.

Cerek K, Hadjiloo E and Grabe J (2025b) Prediction of soil parameters using long short-term memory neural networks. In Proceedings of the 5th International Symposium on Frontiers in Offshore Geotechnics (ISFOG2025), Nantes, France, https://doi.org/10.53243/ISFOG2025-55.

Chicco D, Warrens MJ and Jurman G (2021) The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Computer Science 7: e623, https://doi.org/10.7717/peerj-cs.623.

Chollet F (2015) A superpower for ML developers (accessed 02/02/2026).

Clevert DA, Unterthiner T and Hochreiter S (2015) Fast and accurate deep network learning by exponential linear units (ELUs). arXiv Preprint, https://doi.org/10.48550/arXiv.1511.07289.

Contreras LF and Brown ET (2019) Slope reliability and back analysis of failure with geotechnical parameters estimated using Bayesian inference. Journal of Rock Mechanics and Geotechnical Engineering 11(3): 628–643, https://doi.org/10.1016/j.jrmge.2018.11.008.

Demir S and Sahin EK (2023) Predicting occurrence of liquefaction-induced lateral spreading using gradient boosting algorithms integrated with particle swarm optimization: PSO-XGBoost, PSO-LightGBM, and PSO-CatBoost. Acta Geotechnica 18(6): 3403–3419, https://doi.org/10.1007/s11440-022-01777-1.

Deng Z, Xu L, Li Y et al. (2025) Subgrade cumulative deformation probabilistic prediction method based on machine learning. Soil Dynamics and Earthquake Engineering 191: 109233, https://doi.org/10.1016/j.soildyn.2025.109233.

Dubey SR, Singh SK and Chaudhuri BB (2022) Activation functions in deep learning: a comprehensive survey and benchmark. Neurocomputing 503: 92–108, https://doi.org/10.1016/j.neucom.2022.06.111.

Duffy K, Gavin KG and Lai F (2024) Maximising a foundation’s lifetime through monitoring: a case study from the Port of Rotterdam. Paper presented at the 2nd Annual Conference on Foundation Decarbonization and Re-use, Amsterdam, Netherlands.

Dugas C, Bengio Y, Bélisle F, Nadeau C and Garcia R (2000) Incorporating second order functional knowledge for better option pricing. In Advances in Neural Information Processing Systems (Leen T, Dietterich T and Tresp V (eds)). MIT Press, Cambridge, MA, USA, pp. 472–478.

Duong NT, Lai VQ, Keawsawasvong S, Nguyen TS and Kido R (2025) Uplift capacity analysis of inclined strip anchors considering spatial variability of undrained shear strength: RAFELA and ANN. Computers and Geotechnics 177: 106915, https://doi.org/10.1016/j.compgeo.2024.106915.

El-Sekelly W, Raja MNA and Abdoun T (2025) Explainable AI for predicting the strength of bio-cemented sands. Journal of Rock Mechanics and Geotechnical Engineering, https://doi.org/10.1016/j.jrmge.2025.05.029.

Ferrero J, Ruiz López A, Taborda DMG and Brasile S (2023) Applying the observational method to a deep braced excavation using an artificial neural network. In Proceedings of the 10th European Conference on Numerical Methods in Geotechnical Engineering, https://doi.org/10.53243/NUMGE2023-303.

Forrester AIJ, Sóbester A and Keane AJ (2008) Engineering Design via Surrogate Modelling: A Practical Guide. Wiley, New York, NY, USA, https://doi.org/10.1002/9780470770801.

Ghazavi M and Valinezhad-Torghabeh N (2024) Behaviour of geocell reinforced sand supporting footings using response surface method. Geotechnical and Geological Engineering 42(6): 5283–5299, https://doi.org/10.1007/s10706-024-02841-1.

Graves A and Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18(5–6): 602–610, https://doi.org/10.1016/j.neunet.2005.06.042.

Hecht-Nielsen R (1993) Neurocomputing, Reprint with Corrections, 5th Printing. Addison-Wesley, Reading, MA, USA.

Heshmati RA, Alavi AH, Keramati M and Gandomi AH (2012) A radial basis function neural network approach for compressive strength prediction of stabilized soil. In Proceedings of Road Pavement Material Characterization and Rehabilitation, pp. 147–153, https://doi.org/10.1061/41043(350)20.

Hochreiter S and Schmidhuber J (1997) Long short-term memory. Neural Computation 9(8): 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.

HTG and DGGT (Hafentechnische Gesellschaft e.V. and Deutsche Gesellschaft für Geotechnik e.V.) (2024) Recommendations of the Committee for Waterfront Structures: Harbours and Waterways: EAU 2020. Ernst & Sohn, Berlin, Germany.

ISO (2006a) ISO 14040:2006: Environmental management life cycle assessment principles and framework. ISO, Geneva, Switzerland.

ISO (International Organization for Standardization) (2006b) ISO 14044:2006: Environmental management life cycle assessment requirements and guidelines. ISO, Geneva, Switzerland.

Jaquès I, Cano C, Llopart J, Solà B and Aliguer I (2024) Ground model workflow with DAARWIN. In Proceedings of the 7th International Conference on Geotechnical and Geophysical Site Characterization (ISC), Barcelona, Spain, https://doi.org/10.23967/isc.2024.028.

Jiang F, Ma L, Broyd T and Chen K (2021) Digital twin and its implementations in the civil engineering sector. Automation in Construction 130: 103838, https://doi.org/10.1016/j.autcon.2021.103838.

Keawsawasvong S, Shiau J, Duong NT et al. (2025) Enhancing understanding of 3D rectangular tunnel heading stability in c-φ soils with surcharge loading: a comprehensive FELA analysis using three stability factors and machine learning. Artificial Intelligence in Geosciences 6(1): 100111, https://doi.org/10.1016/j.aiig.2025.100111.

Kendall A, Raymond AJ, Tipton J and DeJong JT (2018) Review of life-cycle-based environmental assessments of geotechnical systems. Proceedings of the Institution of Civil Engineers – Engineering Sustainability 171(2): 57–67, https://doi.org/10.1680/jensu.16.00073.

Khaledi K, Miro S, König M and Schanz T (2014) Robust and reliable metamodels for mechanized tunnel simulations. Computers and Geotechnics 61: 1–12, https://doi.org/10.1016/j.compgeo.2014.04.005.

Kingma DP and Ba J (2014) Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, https://doi.org/10.48550/arXiv.1412.6980.

Kovačević MS, Bačić M and Gavin K (2021) Application of neural networks for the reliability design of a tunnel in karst rock mass. Canadian Geotechnical Journal 58(4): 455–467, https://doi.org/10.1139/cgj-2019-0693.

Kumar K (2025) Deep learning in geotechnical engineering: a critical assessment of PINNs and operator learning. arXiv Preprint, https://doi.org/10.48550/arXiv.2512.24365.

Kutanaei SS and Choobbasti AJ (2019) Prediction of liquefaction potential of sandy soil around a submarine pipeline under earthquake loading. Journal of Pipeline Systems Engineering and Practice 10(2): 04019002, https://doi.org/10.1061/(ASCE)PS.1949-1204.000034.

Li L, Jamieson K, DeSalvo G, Rostamizadeh A and Talwalkar A (2018) Hyperband: a novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research 18: 1–52.

Liu D, Lin P, Zhao C and Qiu J (2021) Mapping horizontal displacement of soil nail walls using machine learning approaches. Acta Geotechnica 16(12): 4027–4044, https://doi.org/10.1007/s11440-021-01345-z.

McCulloch WS and Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5(4): 115–133, https://doi.org/10.1007/BF02478259.

Minsky M and Papert S (1988) Marvin Minsky and Seymour Papert, Perceptrons. MIT Press, Cambridge, MA, USA, https://doi.org/10.7551/mitpress/4943.003.0015.

O’Malley T, Bursztein E, Long J et al. (2019) KerasTuner (accessed 02/02/2026).

Parhi R and Nowak RD (2020) The role of neural network activation functions. IEEE Signal Processing Letters 27: 1779–1783, https://doi.org/10.1109/LSP.2020.3027517.

Prechelt L (2002) Early stopping but when? In Neural Networks: Tricks of the Trade (Orr GB and Müller KR (eds)). Springer, Heidelberg, Germany, pp. 55–69.

Purdy CM, Raymond AJ, DeJong JT et al. (2022) Life-cycle sustainability assessment of geotechnical site investigation. Canadian Geotechnical Journal 59(6): 863–877, https://doi.org/10.1139/cgj-2020-0523.

Queipo NV, Haftka RT, Shyy W et al. (2005) Surrogate-based analysis and optimization. Progress in Aerospace Sciences 41(1): 1–28, https://doi.org/10.1016/j.paerosci.2005.02.001.

Raja MNA, Jaffar STA, Bardhan A and Shukla SK (2023) Predicting and validating the load–settlement behavior of large-scale geosynthetic-reinforced soil abutments using hybrid intelligent modeling. Journal of Rock Mechanics and Geotechnical Engineering 15(3): 773–788, https://doi.org/10.1016/j.jrmge.2022.04.012.

Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65(6): 386–408.

Ruiz López A, Taborda D, Tsiampousi A et al. (2024) Applying machine learning to the development of surrogate models for shafts in clay. In Geotechnical Engineering Challenges to Meet Current and Emerging Needs of Society. CRC Press, Boca Raton, FL, USA, pp. 795–798, https://doi.org/10.1201/9781003431749-133.

Rumelhart DE, Hinton GE and Williams RJ (1986) Learning representations by backpropagating errors. Nature 323(6088): 533–536, https://doi.org/10.1038/323533a0.

Samuelsson I, Spross J and Larsson S (2024) Integrating life-cycle environmental impact and costs into geotechnical design. Proceedings of the Institution of Civil Engineers – Engineering Sustainability 177(1): 19–30, https://doi.org/10.1680/jensu.23.00012.

Sanfilippo R, Esfandiari M, Foria F et al. (2025) ITA − AITES tunnelling information modelling − a BIM approach for a sustainable life cycle management. Tunnelling and Underground Space Technology 165: 106711, https://doi.org/10.1016/j.tust.2025.106711.

Sarkar S and Hegde A (2025) Reliability assessment of steel slag and construction waste backfill for reinforced earth structures using response surface method. Soils and Foundations 65(1): 101569, https://doi.org/10.1016/j.sandf.2025.101569.

Schuster M and Paliwal KK (1997) Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45(11): 2673–2681, https://doi.org/10.1109/78.650093.

Shubham K, Metya S and Sinha AK (2024) Surrogate model-based prediction of settlement in foundation over cavity for reliability analysis. Transportation Infrastructure Geotechnology 11(3): 1294–1320, https://doi.org/10.1007/s40515-023-00329-8.

Sinsomboonthong S (2022) Performance comparison of new adjusted min-max with decimal scaling and statistical column normalization methods for artificial neural network classification. International Journal of Mathematics and Mathematical Sciences 2022(6): 1–9, https://doi.org/10.1155/2022/3584406.

Staudemeyer RC and Morris ER (2019) Understanding LSTM – A tutorial into long short-term memory recurrent neural networks. arXiv Preprint, https://doi.org/10.48550/arXiv.1909.09586.

Tang L, Ma Y, Wang L et al. (2021) Application of long short-term memory neural network and prophet algorithm in slope displacement prediction. International Journal of Geoengineering Case Histories 6(4): 48–66, https://doi.org/10.4417/IJGCH-06-04-04.

Tao Y, Sun H and Cai Y (2022) Predictions of deep excavation responses considering model uncertainty: integrating BiLSTM neural networks with Bayesian updating. International Journal of Geomechanics 22(1): 4021250, https://doi.org/10.1061/(ASCE)GM.1943-5622.0002245.

Tao Y, Zeng S, Ying T et al. (2024) A deep transfer learning model for the deformation of braced excavations with limited monitoring data. Journal of Rock Mechanics and Geotechnical Engineering 17(3): 1555–1568, https://doi.org/10.1016/j.jrmge.2024.02.048.

Tian HM, Wang Y and Phoon KK (2024) Real-time fusion of multi-source monitoring data with geotechnical numerical model results using data-driven and physics-informed sparse dictionary learning. Canadian Geotechnical Journal 61(11): 2535–2552, https://doi.org/10.1139/cgj-2023-0457.

UN (United Nations) (2015) Transforming our world: The 2030 Agenda for Sustainable Development (accessed 02/02/2026).

Van Rossum G and Drake FL (2009) Python 3 Reference Manual. CreateSpace, Scotts Valley, CA, USA.

Vazirizade SM and Haldar A (2021) A novel risk evaluation procedure using a kriging-based surrogate modeling for offshore structures. KSCE Journal of Civil Engineering 25(7): 2603–2612.

Vink R (2016) anaStruct 2D frames and trusses (accessed 02/02/2026).

Wan Q, Zhu Y, Ding H, Hu W and Xu C (2025) Automated design framework for excavation retaining structures: extending IFC standards and integrating BIM with geotechnical simulation. Underground Space 24: 261–282, https://doi.org/10.1016/j.undsp.2025.04.007.

Yu Y and Bathurst RJ (2017) Probabilistic assessment of reinforced soil wall performance using response surface method. Geosynthetics International 24(5): 524–542, https://doi.org/10.1680/jgein.17.00019.

2025 Emerald Publishing Limited

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen in the CC BY 4.0 licence text.

