Vibrations induced by external loads play a critical role in the performance and safety of high-speed train bogies. Accurate knowledge of the dynamic forces acting on bogie frames is essential for predicting structural responses, enhancing numerical modelling and planning maintenance more effectively. This study aims to develop and evaluate a comprehensive model-based framework for predicting excitation forces on railway bogies, addressing the challenges posed by forces that are difficult or impossible to measure directly.
The proposed framework integrates multi-body dynamics (MBD) simulations, structural finite element (FE) modelling and machine learning (ML) to estimate the forces acting on the bogie frame of a high-speed train. Firstly, an MBD model of a Chinese high-speed train was established and validated against in-service measurement data, from which realistic time-domain loads acting on the bogie frame can be obtained. Separately, modal dynamic simulations of the bogie frame's FE model were performed with stochastic loading to extract corresponding accelerations over a broad range of dynamic behaviour. These were employed to train an ML model to learn the inverse mapping from structural response to applied forces. For validation, the MBD-derived forces were applied to the FE model to obtain corresponding accelerations, which were then used to assess the ML model’s ability to reconstruct the original forces.
The approach can successfully predict nonlinear excitation forces acting on the bogie frame. Based on modal system responses to randomized force inputs, the entire parameter space can be represented, and the trained ML model demonstrates a strong capability to estimate dynamic loads from validated MBD simulations. Through appropriate training, the method exhibits robustness against noise and sensor placement and opens new opportunities for improving the analysis of track–vehicle interaction and the dynamic modelling of bogies.
The approach depends on the accuracy of the validated MBD and FE models, meaning modelling assumptions and simplifications may introduce errors in the predicted forces. High-fidelity in-service measurements are required for model validation but are not always available. Purely simulation-based models enable the prediction of forces and load distributions at the bogie, but the results are strongly model-dependent. Even if the models are validated against reference data, they only reflect an idealized operating condition, and uncertainties in measurement parameters, model parameters, damping behaviour or contact models can significantly affect the accuracy of force predictions.
This research introduces a novel, integrated framework for indirect bogie force estimation that enhances both modelling accuracy and practical diagnostic capability in railway engineering. By integrating numerical simulations, in-service measurements and ML, the study advances current methodologies for analysing high-speed railway vehicles. The approach offers valuable potential for refining vehicle models, guiding maintenance strategies and informing future research on data-driven structural force prediction.
1. Introduction
In high-speed trains, wheel–rail interactions create significant forces that are transmitted through the suspension system, resulting in dynamic loads acting on critical components such as the wheelset and the bogie frame. An accurate prediction of these forces is essential for ensuring structural integrity, ride comfort and operational safety. However, this is challenging under real-world conditions, where factors such as wheel wear, track conditions, varying speeds and passenger loads affect the generation of excitation forces. By using operational data, more accurate predictions of these forces can be obtained, allowing the service life of components to be estimated more precisely and potential failures to be anticipated, thereby improving train reliability and minimizing maintenance downtime (Lu, Xiang, Dong, Zhang, & Zeng, 2018; Guo et al., 2025).
Although numerical models and laboratory tests provide valuable insights into the dynamic forces acting on high-speed trains, they cannot fully replicate real operating conditions, as complex non-linear effects and varying environmental conditions are not completely captured. Therefore, extensive in-service measurements are often conducted. Therein, direct and indirect measurement methods can be distinguished. Direct methods for determining the resulting forces rely on the measurement of force quantities. For example, an instrumented wheelset records the forces via the bending of the wheelset shafts and requires exact positioning of the strain gauges as well as calibration on the respective bogie (Wu, Zhang, Wang, We, & Huo, 2023). Indirect methods, on the other hand, infer in-service forces from measurements of other state quantities, such as acceleration or displacement on the car body or bogie, using mathematical methods (Ji, Gao, Liu, & Yang, 2025; Bulduk & Metin, 2025). A general overview of mathematical methods in structural dynamics is provided by Friswell (2008), focussing on model updating and the optimal placement of sensors and selection of parameters. These methods rely on linking measured strains or accelerations to applied loads and solving the system equations.
For the direct measurement of forces on the bogie, exact positioning of the strain gauges and optimized signal processing are crucial in order to minimize interference caused by high dynamic effects. In addition, this method necessitates calibration on the respective bogie to ensure precise and reliable measurement results. Long-term applications, however, are often subject to error due to temperature influences, material fatigue and long-term drift (Wu et al., 2023; Ji et al., 2025). Due to these constraints on direct force measurement, strain gauge-based methods are often impractical, leading to the development of indirect approaches. In these methods, excitation forces are inferred from measurable system states and reconstructed by solving an inverse dynamics problem. This requires accurate system models and time-dependent analyses, as even small errors in the input data can cause large deviations in reconstructed forces (Chen & Li, 2000; Uhl & Petko, 2004). Among others, these methods are employed in applied vehicle dynamics to detect track irregularities based on measured accelerations during the operation of rail vehicles (Czop, Mendrok, & Uhl, 2011; Zhu, Xiao, Guangwu, Ma, & Zhang, 2013). To improve measurement accuracy, some studies emphasize precise laboratory calibration of instrumentation (Nieminen, Tuohineva, & Autio, 2023; Wu et al., 2023). To further improve the robustness and accuracy of load identification and to reduce measurement noise and drift effects, filter methods are often used (Ghibaudo, Aucejo, & de Smet, 2022; Choi, 2023). These methods have contributed to a significant improvement in the measurement accuracy of the resulting forces.
For excitation load reconstruction, Wang and Sheng (2020) presented a method based on the wavelet transform for a high-speed train bogie. The time domain signals at unmeasured locations are obtained by summing the reconstructed modal responses. Subsequently, the loads applied to the bogie are determined based on impulse response and inverse modal transformation matrices. He et al. (2016) used modal decomposition to reconstruct the strain on complex components. In this approach, the strains are transferred to the critical components based on modal models, and the service life is determined. Frequency domain methods, such as those reviewed by Dobson and Rider (1990), reconstruct forces using the system’s frequency response function. However, these methods encounter computational limitations and instability for transient signals due to the need to invert the frequency-dependent matrices (Liu, Li, Li, & Miao, 2018). Inverse problems are often ill-posed because a solution may not exist, may not be unique or may not be stable with respect to small changes in the input data.
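The instability of the inverse problem can be illustrated with a toy example. The sketch below assumes a made-up 2 × 2 static transfer matrix `H` for two forces and two sensors that respond almost identically; the matrix entries, the force vector and the perturbation are hypothetical values chosen for illustration and are not taken from any of the cited studies. It shows how a relative response error of roughly 0.05% can grow into a relative force error of roughly 10% when `H` is nearly singular.

```python
import math

def solve_2x2(h, x):
    """Solve h @ f = x for a 2x2 matrix h using Cramer's rule."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [(d * x[0] - b * x[1]) / det, (a * x[1] - c * x[0]) / det]

def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(component ** 2 for component in v))

# Hypothetical transfer matrix: both sensors see almost the same response
# to either force, so the rows are nearly linearly dependent.
H = [[1.00, 0.99],
     [0.99, 1.00]]
f_true = [1.0, 1.0]
x_clean = [H[0][0] * f_true[0] + H[0][1] * f_true[1],
           H[1][0] * f_true[0] + H[1][1] * f_true[1]]

# Small, hypothetical measurement error added to the responses.
x_noisy = [x_clean[0] + 1e-3, x_clean[1] - 1e-3]
f_rec = solve_2x2(H, x_noisy)

rel_err_x = norm([xn - xc for xn, xc in zip(x_noisy, x_clean)]) / norm(x_clean)
rel_err_f = norm([fr - ft for fr, ft in zip(f_rec, f_true)]) / norm(f_true)

# For this worst-case perturbation the ratio is close to the condition
# number of H (largest over smallest eigenvalue, 1.99 / 0.01).
amplification = rel_err_f / rel_err_x
```

In a frequency-domain method the same effect appears at every frequency line where the response matrix is close to singular, which is one reason why direct inversion becomes unreliable.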
More generally, Mendrok and Dworakowski (2019) have summarized various methods for reconstructing the excitation forces. They emphasize that the quality of measurement data is often insufficient for a complete reconstruction of the excitation forces and that classical methods often fail due to the nonlinear characteristics of complex systems. Contributory factors typically include sensor placement, model errors and environmental influences that have not been taken into account in the prediction models.
Furthermore, a wide range of machine learning (ML) methods are being developed for load reconstruction. These methods can approximate complex non-linear mappings without explicit physical modelling and offer robustness against noise and modelling uncertainty (Uhl & Petko, 2004). ML-based methods have seen particular use in estimating dynamic loading caused by time-dependent forces and moments. For instance, neural networks are used to predict the non-linear relationships between flight state and time-dependent forces acting on the rotor blades (Haas, Milano, & Flitter, 1993; Mendrok, Kurowski, & Uhl, 2008; Graziani, Prederi, Trezzini, Favale, & Masarati, 2024) or to solve classical response problems of non-linear structures under seismic excitation (Zhang, Liu, & Sun, 2020). Other applications include surrogate modelling with physics-informed neural networks for inverse identification problems (Teloli et al., 2024; Chatterjee, Friswell, Adhikari, & Khodaparast, 2024). Comprehensive literature reviews on various applications of ML can be found in Hewamalage et al. (2021) and Tang et al. (2022).
In the field of condition-based monitoring, intelligent systems are used to predict the system state. Leibner, Boehm, Ebbers, Schindler and Pfaff (2025) employed Temporal Convolutional Networks to simultaneously predict track irregularities, wheel–rail forces and vehicle accelerations from the axle-box or car body accelerations. Zhu et al. (2025) combined classical time-domain methods to determine the vertical forces with ML approaches to achieve more accurate estimation of the lateral wheel–rail forces. The latter are significantly more difficult to derive from the system’s state variables due to tribological and nonlinear effects. The ML model is trained on simulated datasets to numerically validate the resulting forces. Axle-box accelerations (Gadhave & Vyas, 2022; Zhang, Wang, Yang, & Zhang, 2025) or suspension system responses (Walther, Müller, Renggli, Ünlü, & Fuerst, 2023) have also frequently been used as inputs for the neural-network-based prediction of wheel–rail forces, as these can be measured with existing sensor systems.
Other approaches rely on measured quantities such as vehicle speed and accelerations to estimate longitudinal forces within the train system. The developed neural networks serve to precisely analyse and monitor these dynamic loads during operation (Zhang, Huang, & Yan, 2024; Liu, Song, Xu, & Yu, 2025). In this context, the simulated forces from validated simulation results are mapped to measured quantities such as speed and acceleration.
For the approximation of the resulting excitation forces, Meethal et al. (2023) demonstrated an approach based on a combination of generic finite element (FE) models and neural networks. The surrogate models are trained to identify the numerical model parameters in order to determine the resulting forces acting on the system. The method uses gradient-based optimization but requires large amounts of training data for a robust solution and shows a clear dependence on the network architecture.
In the field of surrogate models for multi-body dynamics (MBD) simulations, studies similarly demonstrated good agreement between the dynamic responses predicted by surrogate models and MBD simulations (Uhl, Mendrok, & Chudzikiewicz, 2010; Ye, Huang, Sun, & Shi, 2021). In this context, the relationship between system states and disturbances is represented by suitable neural networks. The original system behaviour is largely reproduced by the surrogate models (Ye et al., 2021). For the prediction of the dynamic structural response, determining the resulting forces at the wheel–rail interface alone is not sufficient. Nonlinear dynamic effects, as well as the dynamic interaction due to the structural-mechanical response of flexible components, have a significant influence on the structural behaviour and thus on the expected service life of the system. Data-driven models therefore require extensive training data, particularly because rare operating conditions are often not sufficiently represented. In addition, methods based on sensor data must be regularly validated in order to minimize measurement-related errors.
In this work, an extended approach for generating general training data for the prediction of nonlinear excitation forces is developed. In the frequency domain, the linear FE system can be strongly reduced, allowing many different excitation forces to be solved very efficiently via superposition of eigenmodes, without the need to compute each load case using a full FE simulation. For this purpose, a detailed FE model of the bogie frame is used to simulate the dynamic behaviour and to generate a displacement-based dynamic response database. By applying randomized excitation forces, the dynamic loads acting on the bogie are determined through superposition of the dominant mode shapes. The complex partial differential equations are transformed into decoupled modal equations of motion, reducing computational effort while still enabling the determination of stresses and strains over time based on the modal dynamic response.
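For reference, the modal reduction described above can be written in standard textbook form. The notation below (mass, damping and stiffness matrices M, C and K, nodal displacements u, mode shapes φ_i collected in Φ, modal coordinates q_i, natural frequencies ω_i, modal damping ratios ζ_i and load vector f) is introduced here for illustration and does not appear in the original text. Mass-normalized modes and modal damping are assumed so that the equations decouple.

```latex
\begin{align}
\mathbf{M}\,\ddot{\mathbf{u}}(t) + \mathbf{C}\,\dot{\mathbf{u}}(t) + \mathbf{K}\,\mathbf{u}(t) &= \mathbf{f}(t), \\
\mathbf{u}(t) \approx \sum_{i=1}^{n} \boldsymbol{\phi}_i\, q_i(t) &= \boldsymbol{\Phi}\,\mathbf{q}(t),
\qquad \boldsymbol{\Phi}^{\mathsf{T}}\mathbf{M}\,\boldsymbol{\Phi} = \mathbf{I}, \\
\ddot{q}_i(t) + 2\,\zeta_i\,\omega_i\,\dot{q}_i(t) + \omega_i^{2}\,q_i(t) &= \boldsymbol{\phi}_i^{\mathsf{T}}\,\mathbf{f}(t),
\qquad i = 1,\dots,n .
\end{align}
```

Since the modes are extracted only once, each additional load case requires the solution of n scalar equations rather than a full FE run, which is what makes a large number of randomized load cases affordable.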
Through stochastic excitation in the frequency domain, the entire spectrum of possible excitation forces acting on the bogie is covered. This allows an efficient exploration of a large parameter space and the generation of a substantial amount of training data. These simulations are used to create a dataset linking system displacements with transient excitation forces, which serves as ground truth for model validation.
Subsequently, based on a validated MBD model of a high-speed train, developed in SIMPACK, realistic dynamic responses on a typical Chinese high-speed railway line are simulated. The FE-generated response database serves as input for training an ML model and enables the subsequent prediction of excitation forces from measured structural responses. In this way, validated MBD simulations are combined with structural FE modelling and data-driven ML techniques. This enables the estimation of dynamic forces under operating conditions that are otherwise difficult or very costly to measure experimentally. It is shown that the use of intelligent systems improves the robustness of force estimation with respect to noise and reduces its sensitivity to sensor placement. Furthermore, an appropriate selection of sensor positions can considerably reduce the required number of sensors. This makes it possible to determine the resulting design loads on the bogie based on realistic simulations and to analyse the actual loads during operation.
2. Methodology and concept
To predict dynamic forces on a high-speed train bogie during operation, an integrated simulation and machine-learning framework is developed. This framework combines MBD simulations and FE modelling to generate realistic physics-based data for training. Existing simulation tools are specialized for specific tasks, but none fully captures all aspects required for predicting dynamic excitation forces. Therefore, these tools are integrated, along with the necessary interfaces, into a comprehensive validation framework that encompasses all relevant subsystems. The resulting architecture, illustrated in Figure 1, is an extension of the model validation approach presented in Thacker et al. (2004). This framework enables the subsequent usage of ML models for the indirect prediction of forces that are difficult to measure directly, supporting both the validation and refinement of railway vehicle models.
The aim is to link real measurement data with a physically based MBD model to better represent real operating conditions and predict the excitation forces. This approach provides a robust dataset for training ML models, enabling force predictions under realistic operational scenarios. The two main blocks that form the basis of any virtual validation architecture are the physics-based simulation model and the real operating measurement data.
The measurement data component captures real-world operational conditions and sensor readings, including global positioning system (GPS) signals, accelerations and structural strains of selected positions from the modelled system. These data not only represent the actual operating scenario but also serve as the ground truth for model validation. In addition, these data provide the boundary conditions for the simulation, such as track geometry, curvature and cant. The track irregularities are based on stochastic data for Chinese high-speed railways.
The physics-based simulation model shows the interactions between the vehicle, its components and the environment. It receives scenario definitions from the measurement data and generates virtual sensor outputs, including forces, accelerations and kinematic states, based on the modelled system behaviour. These simulated outputs are then compared with the real measurement data to evaluate deviations and iteratively update the model parameters.
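The compare-and-update loop can be sketched in a few lines. The example below is a minimal illustration only: the deviation measure (root-mean-square deviation), the candidate search and the function names `rms_deviation`, `select_parameter` and `toy_simulate` are assumptions made here, and the toy "simulation" merely scales a sine wave, whereas a real workflow would call the MBD solver instead.

```python
import math

def rms_deviation(simulated, measured):
    """Root-mean-square deviation between two equally long signals."""
    if len(simulated) != len(measured):
        raise ValueError("signals must have the same length")
    squared = sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return math.sqrt(squared / len(measured))

def select_parameter(candidates, simulate, measured):
    """Return the candidate whose simulated output deviates least from the measurement."""
    return min(candidates, key=lambda p: rms_deviation(simulate(p), measured))

# Toy stand-in for the simulation: a sine response scaled by a gain parameter.
base = [math.sin(2.0 * math.pi * 2.0 * i / 100.0) for i in range(100)]

def toy_simulate(gain):
    """Hypothetical simulation output for a given model parameter."""
    return [gain * value for value in base]

# Pretend measurement, generated here with gain 2.0 for demonstration.
measured = toy_simulate(2.0)
best_gain = select_parameter([1.0, 1.5, 2.0, 2.5], toy_simulate, measured)
```

In practice the candidate search would be replaced by whatever manual or automated updating strategy is used, and several signals would usually be compared at once.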
In general, these two main blocks do not need to be executed in a coupled manner to reflect real-world behaviour more accurately. Manual adjustment of certain system variables and specific combinations of real environmental variables enable a systematic approach for validation, combining empirical data and physics-based modelling to achieve reliable predictions of complex vehicle dynamic phenomena.
The central idea is that a system that has been successfully validated for a limited number of representative system variables is fundamentally capable of making reliable statements and delivering consistent, robust simulation results for other system variables that have not yet been explicitly tested, such as excitation forces.
Another key component of the framework is the training data generation block for ML, which is designed to generate robust training data for predicting complex system forces. In this block, the training data are generated from stochastic force inputs covering a wide range of possible operating scenarios and system responses. The subsequent modal response analysis makes it possible to efficiently characterize the dynamic behaviour of the system and extract relevant response variables without having to map every possible excitation case completely through numerical simulation. This results in a comprehensive, diversified dataset that represents the complex relationships between input excitation and system responses.
Once the subsystems required for a virtual validation framework have been defined, the corresponding interfaces must also be taken into account in order to be able to conceptually understand the interaction and data exchange between the components. Key interfaces include the transmission of operating conditions, such as speed and route parameters, including track geometry and curve radii, as well as the qualitative comparison of simulated sensor values with real measurement data. Based on the scenario description, the driving dynamics variables required for the validation of the simulation models, such as wheel-rail forces, accelerations and other system-relevant input variables, are derived, ensuring a realistic and physically consistent representation of vehicle behaviour. At the same time, the transfer of the measurement data into modelled sensor values enables the identification of discrepancies between modelled and actual signals, allowing the simulation model to be iteratively adjusted and continuously improved.
3. Multi-body dynamics model
The vehicle under investigation is a high-speed electric multiple unit (EMU) operating in China, which is used for long-distance passenger transport and typically consists of eight cars. For the purpose of this study, however, the MBD model, given in Figure 2, is limited to three cars. The MBD vehicle model used to evaluate the validation method has been set up using the SIMPACK software environment.
The simulation model represents the high-speed EMU in detail, covering the dynamic interactions between the car body, bogie frame, non-linear wheel–rail contact and suspension components. Based on the wheel and rail profile, the SIMPACK Wheel-Rail module automatically calculates the contact geometry. The rail profile used is the CN60N and the wheel profile used is the LMB10. This results in an equivalent conicity of 0.1, corresponding to the condition of new wheelsets.
The model accounts for all relevant translational and rotational degrees of freedom, enabling realistic simulation of the bogie and car body responses to track irregularities and excitation forces. The primary and secondary suspension are represented by non-linear spring-damper elements, whose stiffness and damping characteristics are realistically determined based on design data and experimental measurements.
In addition to the primary spring and secondary air spring, the bogie is equipped with an anti-roll bar as well as yaw and lateral dampers. The connection to the car body is made via a lemniscate joint. The suspension systems have been set up with linear elastic force elements with constant stiffness and damping values, whereas the lateral bumpstop and yaw damper are modelled with non-linear spring-damper characteristics.
As this study does not involve significant changes in the operating speed of the train, detailed longitudinal modelling is not required. Consequently, the intercar connections are simply modelled by linear point-to-point suspension elements.
The leading and trailing cars are designated as trailer cars, while the middle car functions as a motor car. Each carriage was modelled as a rigid body and the corresponding moments of inertia were adapted to the lumped mass model. In addition to the car body structures, each wheelset, control arm, bogie frame and bolster beam is modelled as a separate rigid body. The bogies as well as the wheelsets are connected via the aforementioned suspension elements. The rigid representation captures the essential physical behaviour of the train cars while simplifying the complexity of the model for efficient analysis. The eigenmodes of the car body structure and the corresponding natural frequencies are given in Table 1 alongside those of the bogie frame. They result from the combination of the mass, stiffness and inertia properties of the car body and bogie frame.
The frequencies listed in the table include both translational and rotational motions of the vehicle, such as vertical translation (bounce), pitching about the transverse axis (pitch), lateral translation (sway) and yawing about the vertical axis (yaw). The longitudinal modes of the configuration have not been analysed in further detail.
4. MBD model validation
The MBD model was initially developed using the nominal design parameters and validated against roller rig results. However, there was no previous consideration of on-track test data. Subsequently, the original model was adjusted with the aim of improving the alignment between on-track test results and simulation outputs. For this study, a curved track based on a realistic high-speed line was used, with track irregularities generated according to the Chinese high-speed rail ballastless track spectrum reported by Kang et al. (2014). The track used in the simulations represents realistic high-speed conditions and is based on a section of ballastless China railway track system (CRTS) II track between Beijing and Shanghai, given in Figure 3. Therein, both straight and curved sections, including cant, were considered to capture realistic vehicle–track interaction.
The upper part of Figure 3 shows the longitudinal and transverse track positions, which determine the curve geometry along the section, with curve radii ranging between 8 and 11 km. The lower part illustrates the cant distribution along the track length. This track section was selected for the mostly constant operating speed of the train during data collection. These data provide a complete representation of the track geometry and are used as a reference scenario for subsequent simulations of the dynamic vehicle response. The simulation results have been validated against measurement data collected at the middle of the car body’s left side beam to assess the dynamic behaviour under operational conditions. These measurements, obtained as part of an internal testing campaign, are used here solely for model validation; details of the measurement setup are not disclosed.
Figure 4 shows the measured lateral accelerations of the high-speed train car body, compared to numerical simulations for both constant-speed and GPS-adjusted variable-speed conditions. The acceleration time histories indicate only minor differences between constant-speed and variable-speed simulations. This suggests that, for the considered track curvature and vehicle model, vertical and lateral car body dynamics are relatively insensitive to small speed variations. Nevertheless, incorporating the actual speed profile is important for capturing transient effects such as acceleration and deceleration over gradients and curves. Subtle discrepancies between measured and simulated accelerations can highlight limitations in the vehicle model, track representation or damping parameters, all of which are essential for accurate ride comfort and structural load predictions.
The validation of the multi-body simulation model demonstrated that the general dynamic behaviour of the vehicle can be reproduced with good accuracy when compared to experimental measurements. Adjustments to key parameters, such as vehicle speed, further improved the agreement between simulated and measured lateral accelerations, particularly with respect to vibration amplitudes. While such results indicate that the model captures the essential dynamics of the system, the evaluation process still relies to some extent on engineering judgement when interpreting the level of agreement between simulation and measurement. Overall, the model captures the essential system dynamics, providing confidence in its applicability for structural loads.
The generated dataset has been used in the current section for validation of the MBD model. The training of the ML model is carried out exclusively using data obtained from the structural FE simulations, ensuring a clear separation between the physical validation dataset and the data used for model training.
5. Finite element modelling for data generation
A detailed FE model has been developed to represent the structural dynamics of the system. The model was validated against experimental modal test results to ensure the consistency of the modelled eigenfrequencies and mode shapes with those of the actual structure. Based on the validated model, dynamic simulations under randomized excitation were performed to generate an extensive dataset for subsequent analysis.
A hybrid FE model of the bogie was developed, in which the wheelset axles were represented as rigid beams and integrated with the primary suspension to satisfy boundary conditions. The bogie frame was modelled using linear shell (2D) elements for the boxed beam structure, in accordance with International Institute of Welding (IIW) guidelines (Hobbacher, 2016). Geometrically complex forged components were modelled with solid (3D) elements and connected to the frame using Abaqus tie constraints.
For training dataset generation, the natural frequencies and mode shapes describing the characteristic vibration behaviour of the bogie were extracted and exported. Using these modal parameters, the modal transient analysis was performed by exciting the system with stochastic unit-force inputs applied at pre-defined load application points on the bogie frame (cf. Figure 5). The figure shows the dynamic response behaviour of the structure to any stochastic force excitation. This procedure is repeated multiple times to generate a sufficiently large and diverse dataset that spans a broad range of generic load scenarios. For each simulation, the resulting dynamic states are generated at 16 nodes corresponding to the sensor positions used in the in-service measurements, providing the basis for subsequent ML model training.
The FE model of the system was simulated in the time domain using a modal transient analysis under dynamic excitation up to a defined maximum frequency, in order to capture the system response across all relevant natural frequencies and dynamic modes and to generate a comprehensive dataset for analysis.
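The principle of a modal transient analysis can be sketched on a toy system. The example below is not the bogie model: it assumes a two-mass chain (wall, spring, mass, spring, mass, spring, wall) whose mass-normalized modes are known analytically, a uniform modal damping ratio of 5% and a simple semi-implicit Euler time integrator. All numerical values and the function name `modal_transient` are hypothetical. Each mode is integrated as an independent single-degree-of-freedom oscillator and the physical displacements are recovered by superposition.

```python
import math

def modal_transient(force_history, dt, m=1.0, k=(2.0 * math.pi * 10.0) ** 2, zeta=0.05):
    """Displacements of a 2-DOF chain (wall-k-m-k-m-k-wall) by modal superposition.

    force_history: list of (f1, f2) nodal forces, one entry per time step.
    Returns a list of (u1, u2) displacements, one entry per time step.
    """
    # Analytic mass-normalized modes of this symmetric toy system.
    s = 1.0 / math.sqrt(2.0 * m)
    modes = [(s, s), (s, -s)]
    omegas = [math.sqrt(k / m), math.sqrt(3.0 * k / m)]

    q = [0.0, 0.0]  # modal coordinates
    v = [0.0, 0.0]  # modal velocities
    history = []
    for f1, f2 in force_history:
        for i, (phi, w) in enumerate(zip(modes, omegas)):
            p = phi[0] * f1 + phi[1] * f2  # modal force phi_i^T f
            acc = p - 2.0 * zeta * w * v[i] - w * w * q[i]
            v[i] += dt * acc               # semi-implicit Euler: velocity first
            q[i] += dt * v[i]              # then position with the new velocity
        # Back-transformation to physical displacements u = sum_i phi_i q_i.
        u1 = sum(phi[0] * qi for phi, qi in zip(modes, q))
        u2 = sum(phi[1] * qi for phi, qi in zip(modes, q))
        history.append((u1, u2))
    return history

# Hypothetical load case: 1 N step load on the first mass, 3 s at 1000 Hz.
k_toy = (2.0 * math.pi * 10.0) ** 2
response = modal_transient([(1.0, 0.0)] * 3000, dt=1e-3)
u_final = response[-1]
# Exact static solution K^-1 f for K = [[2k, -k], [-k, 2k]] and f = (1, 0).
u_static = (2.0 / (3.0 * k_toy), 1.0 / (3.0 * k_toy))
```

Once the transient has decayed, the superposed response settles at the static solution, which gives a simple plausibility check for the integration. Replacing the step load with a random force history yields one input-output pair of the kind used for training.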
The upper frequency limit of the excitation was set to 100 Hz to ensure that all relevant natural frequencies and dynamic modes of the system were covered. Frequencies above this limit have been excluded, as the system response is significantly reduced due to damping effects and model uncertainties degrade the reliability of the simulation at higher frequencies. The excitation was applied stochastically at all load application points of the bogie. Subsequently, the resulting system responses were structured and normalized for all measurement locations, following which they were divided into training and test sets to create a comprehensive dataset for ML.
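One possible way of generating a band-limited stochastic force history and normalizing a signal is sketched below. The construction is an assumption made for illustration: the force is built as a sum of sinusoids at integer frequencies from 1 Hz to 100 Hz with Gaussian amplitudes and uniformly distributed phases, and the normalization is a z-score. The text does not state which random process or normalization was actually used, and the function names are hypothetical.

```python
import math
import random

def band_limited_force(rng, f_max_hz=100, fs_hz=1000, duration_s=1.0):
    """Random force history built from sinusoids at 1..f_max_hz Hz."""
    n = int(round(fs_hz * duration_s)) + 1  # both end points are stored
    amps = [rng.gauss(0.0, 1.0) for _ in range(f_max_hz)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(f_max_hz)]
    force = []
    for i in range(n):
        t = i / fs_hz
        force.append(sum(a * math.sin(2.0 * math.pi * (j + 1) * t + p)
                         for j, (a, p) in enumerate(zip(amps, phases))))
    return force

def zscore(signal):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(signal) / len(signal)
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / len(signal))
    return [(x - mean) / std for x in signal]

def dft_magnitude(signal, freq_hz, fs_hz=1000):
    """Magnitude of the discrete Fourier sum at one frequency, divided by the length."""
    n = len(signal)
    re = sum(x * math.cos(2.0 * math.pi * freq_hz * i / fs_hz) for i, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i, x in enumerate(signal))
    return math.hypot(re, im) / n

rng = random.Random(0)           # fixed seed for reproducibility
force = band_limited_force(rng)  # 1001 samples at 1000 Hz
force_normalized = zscore(force)
```

Evaluating `dft_magnitude` on the first 1000 samples at a frequency above the limit, for example 150 Hz, returns a value at round-off level, which confirms that the generated excitation contains no content above 100 Hz.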
The resulting database consists of input variables, represented by displacements and accelerations at selected points, and output variables, represented by transient forces. These data are stored in structured time-series format and serve as the training and test datasets for the ML model. The model is trained to reconstruct force histories from simulated deformations, enabling accurate inverse force prediction. As direct force measurements on the bogie frame are not feasible under realistic operating conditions, this approach provides a data-driven way to identify the time histories of forces from reconstructed dynamic responses.
6. Machine learning framework
Long short-term memory (LSTM) architecture, first described by Hochreiter and Schmidhuber (1994), is a special variant of recurrent neural networks that was developed to better model long-term dependencies in sequence data. Through so-called gates, LSTM networks can selectively forget information and thus retain relevant information over many time steps. Therefore, LSTM networks are frequently used in the field of time series prediction and predictive maintenance (Elsworth & Güttel, 2020; De Simone et al., 2023).
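For orientation, a commonly used formulation of the LSTM cell is given below. The symbols (input x_t, hidden state h_t, cell state c_t, weight matrices W and U, biases b, logistic function σ and element-wise product ⊙) are introduced here and are not part of the original text; the exact variant implemented in a given software library may differ in detail.

```latex
\begin{align}
\mathbf{f}_t &= \sigma\!\left(\mathbf{W}_f\,\mathbf{x}_t + \mathbf{U}_f\,\mathbf{h}_{t-1} + \mathbf{b}_f\right), \\
\mathbf{i}_t &= \sigma\!\left(\mathbf{W}_i\,\mathbf{x}_t + \mathbf{U}_i\,\mathbf{h}_{t-1} + \mathbf{b}_i\right), \\
\mathbf{o}_t &= \sigma\!\left(\mathbf{W}_o\,\mathbf{x}_t + \mathbf{U}_o\,\mathbf{h}_{t-1} + \mathbf{b}_o\right), \\
\tilde{\mathbf{c}}_t &= \tanh\!\left(\mathbf{W}_c\,\mathbf{x}_t + \mathbf{U}_c\,\mathbf{h}_{t-1} + \mathbf{b}_c\right), \\
\mathbf{c}_t &= \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{c}}_t, \\
\mathbf{h}_t &= \mathbf{o}_t \odot \tanh\!\left(\mathbf{c}_t\right).
\end{align}
```

The forget gate f_t scales the previous cell state, the input gate i_t controls how much of the candidate state is written, and the output gate o_t determines which part of the cell state is exposed as the hidden state.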
Other publications have examined the ability of LSTM architectures to model complex temporal dependencies and patterns in time series (Lindemann, Müller, Vietz, Jazdi, & Weyrich, 2021; Hewamalage, Bergmeir, & Bandara, 2021). According to these studies, the high performance of LSTM networks is based not only on greater model complexity but also on their architecture of memory and forget gates, which models temporal dependencies more efficiently. In addition, LSTMs can learn patterns directly from sufficiently long and homogeneous time series. Furthermore, compared to transformers, which are based on self-attention, LSTMs are often more efficient for smaller datasets and sequential predictions with shorter time horizons, while transformers can show advantages especially for very long dependencies and large amounts of data (Zeng, Chen, Zhang, & Xu, 2022).
To adequately model the highly dynamic nature of forces acting on the bogie and the nonlinear influence of bumpstops on the dynamic behaviour of the bogie in the lateral direction, the selected ML model must be capable of representing temporal dependencies and nonlinearities. Furthermore, since only a limited amount of data based on simulation results is available, an LSTM architecture, shown in Figure 6, is used in a first step. The model consists of a recurrent layer of 100 neurons that uses the tanh activation function. The input shape of the LSTM layer is N × 1001, where N denotes the number of sensor signals and 1001 represents the time-series length per sample. The dataset comprises 1024 samples that contain measurements from up to 16 virtual sensors in three spatial directions on the bogie. The data were recorded at a sampling rate of 1000 Hz over a duration of 1 s per sample.
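The dimensions quoted above can be reproduced with simple bookkeeping. The sketch below assumes that the 1001 time steps result from storing both the first and the last instant of a 1 s record sampled at 1000 Hz, and that every sensor contributes three directional signals; the helper names and the axis order of `dataset_shape` are hypothetical.

```python
def samples_per_record(fs_hz, duration_s):
    """Number of time steps when both the first and the last instant are stored."""
    return int(round(fs_hz * duration_s)) + 1

def channel_count(n_sensors, directions_per_sensor=3):
    """Number of input signals N for a given number of tri-axial sensors."""
    return n_sensors * directions_per_sensor

n_steps = samples_per_record(1000, 1.0)  # 1001 time steps per sample
n_channels = channel_count(16)           # N = 48 when all 16 sensors are used
n_records = 1024

# Assumed axis order: (records, signals, time steps), i.e. N x 1001 per record.
dataset_shape = (n_records, n_channels, n_steps)
```

Many deep-learning libraries expect recurrent inputs ordered as (time steps, features) per sample, so the N × 1001 arrays may need to be transposed before they are passed to the network.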
The model was compiled using the Adam optimizer, with mean squared error as the loss function and mean absolute error as an additional performance metric. Training was carried out for up to 100 epochs with a batch size of 1 and an 80:20 training–validation split. In addition, early stopping was applied with a patience of 20 epochs, and a learning rate scheduler gradually reduced the learning rate by 5% per epoch.
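The training schedule described above can be illustrated without any deep learning library. The sketch below is an assumption-laden illustration rather than the code used in this study: it interprets the 5% reduction as a multiplicative decay per epoch, and the initial learning rate of 1e-3 in the usage lines is a hypothetical value, since none is stated in the text.

```python
def decayed_learning_rate(initial_lr: float, epoch: int, decay: float = 0.05) -> float:
    """Learning rate after `epoch` epochs, reduced by `decay` each epoch."""
    return initial_lr * (1.0 - decay) ** epoch


def train_val_split(n_samples: int, train_fraction: float = 0.8) -> tuple:
    """Return (training, validation) sample counts for the given split."""
    n_train = int(n_samples * train_fraction)
    return n_train, n_samples - n_train


def early_stopping_epoch(val_losses, patience: int = 20) -> int:
    """Index (0-based) of the epoch at which training would stop.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs; otherwise it runs to the last epoch.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


print(train_val_split(1024))                 # sample counts for the 80:20 split
print(decayed_learning_rate(1e-3, 99))       # hypothetical initial rate of 1e-3
```

With 1024 samples, the 80:20 split gives 819 training and 205 validation samples. Under the multiplicative interpretation, the learning rate after 99 decay steps is below 1% of its initial value, so the late epochs only make small adjustments.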
The LSTM network was trained on datasets with varying numbers, positions and arrangements of sensors. During training, randomized forces were applied sequentially over time to the system, and the resulting responses were recorded across all sensor locations and directions. The network was then trained to predict the forces on the bogie frame using only a subset of these measurements. Since the prediction of excitation forces is very sensitive to the input signals, artificial noise was added to the input signals during training to increase the robustness of the model. The training data were augmented with white Gaussian noise, whose standard deviation was set to 0%, 5% and 10% of the standard deviation of the input signals to investigate the influence of noise injection.
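The noise injection described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this study. The function name `add_gaussian_noise` and the 5 Hz test signal are hypothetical, and the noise level is assumed to be defined per signal relative to that signal's own standard deviation.

```python
import math
import random
import statistics


def add_gaussian_noise(signal, noise_level, rng):
    """Return a copy of `signal` with zero-mean white Gaussian noise added.

    The noise standard deviation is `noise_level` times the standard
    deviation of the clean signal (for example, 0.05 for 5%).
    """
    sigma = noise_level * statistics.pstdev(signal)
    if sigma == 0.0:
        return list(signal)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Hypothetical clean sensor signal: 1 s at 1000 Hz, 1001 points.
clean = [math.sin(2.0 * math.pi * 5.0 * k / 1000.0) for k in range(1001)]

rng = random.Random(0)  # fixed seed so the augmentation is reproducible
augmented = {level: add_gaussian_noise(clean, level, rng)
             for level in (0.0, 0.05, 0.10)}
```

Scaling the noise to each signal's own standard deviation keeps the signal-to-noise ratio comparable across sensors with very different amplitudes.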
Different sensor configurations were tested, first following a fixed numbering scheme and then selecting sensors randomly, to evaluate how placement affected prediction accuracy. This process allowed us to identify an optimal configuration that enabled accurate predictions with the minimum number of sensors, revealing both the essential sensors and their best positions on the bogie frame.
7. Results and discussion
Using the integrated simulation and ML framework developed in this study, the reconstruction of dynamic excitation forces on the bogie frame was evaluated. The quality of the force reconstruction depends strongly on both the number and the specific locations of sensors. The trained LSTM model was evaluated using simulated excitation forces, with acceleration data from the simulations serving as input and the corresponding forces predicted by the simulation model as the output. The analysis focuses on the prediction of interface forces acting on the bogie frame based on a rigid MBD model in SIMPACK.
The predicted forces were then compared with the reference simulation results, as illustrated in Figure 7 (left). Dashed lines indicate the predicted forces, while solid lines represent the simulated results, demonstrating good overall agreement. Lateral loads and anti-roll bar forces arise primarily from centrifugal and track-twist-related effects on the wheelsets, whereas vertical loads vary slightly with speed and curve radius. Differences are observed in the prediction of the lateral force of the bumpstop. This is due to the strongly nonlinear nature of the lateral forces on the bogie.
Figure 7 (right) presents the general comparison of predicted forces against the actual reference, with the red line representing an exact fit between the ML predictions and the reference force. Ideally, the developed system would accurately predict the resulting forces on the bogie frame for any number of sensors. In a graphical representation of the predicted values compared to the actual values, all points would therefore lie on a straight line through the origin at a 45° angle to both axes (identity line). In practice, however, deviations occur so that individual points are not exactly on this line. The distance of a point from the identity line corresponds to the respective prediction error. The discontinuity associated with the nonlinear bumpstop forces also leads to a significant increase in the number of training epochs required. In general, the LSTM model captures the main trends accurately, particularly in the low-frequency range, showing minimal frequency-dependent deviations and consistent phase alignment.
When measuring stresses or accelerations on the real system, the resulting measurements are highly dependent on the position of the sensors. This can lead to significant deviations in the prediction of the resulting forces. Sensors placed in areas where the quantity of interest varies rapidly in space (i.e. has high spatial gradients) show a disproportionate amplification of errors. For accelerometers, the sensitivity results from the local dynamic response of the structure. Ideal positions are those with high modal participation in the respective dynamic response of the system.
Figure 8 shows the relative error with respect to the range of variation of the force amplitude for different intensities of injected noise. Broadband noise was applied to all input variables at levels of up to 10% of the standard deviation. Expressed as root mean square error, the prediction error on the noisy signals is below 4%; only the lateral forces show a higher error of about 7%. As the noise on the input signals increases, it becomes apparent that, with suitable sensor placement and for input deviations of up to about 10% of the total value, the vertical forces (F1 and F2) are predicted with an error of the same order of magnitude as the noise on the input variables.
For the lateral forces (F3 and F4), on the other hand, a much stronger influence of the injected input noise on the prediction accuracy can be observed. In this case, significantly higher prediction errors occur, which can be attributed to the high sensitivity of the lateral force components and the nonlinear character of the forces at the bumpstop. The forces caused by the rolling moment (F5 and F6), in turn, show prediction errors of the same order of magnitude as the disturbances of the input variables.
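One plausible way to compute the relative error reported in Figure 8 is a root mean square error normalized by the range of the reference force. The sketch below rests on that assumption; the exact normalization used for the figure is not spelled out in the text, and the function name and numerical values are hypothetical.

```python
import math


def range_normalized_rmse(reference, predicted):
    """Root mean square error divided by the range (max - min) of the reference."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("reference and predicted must be non-empty and equally long")
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference signal has zero range")
    mse = sum((p - r) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    return math.sqrt(mse) / span


# Hypothetical force history in kN and a prediction that is off by 1 kN everywhere.
reference_force = [0.0, 10.0, 0.0, -10.0]
predicted_force = [1.0, 9.0, 1.0, -9.0]
print(range_normalized_rmse(reference_force, predicted_force))
```

In this toy case, the error is 1 kN over a range of 20 kN, which gives a relative error of 5%. Normalizing by the range makes the error values of forces with very different amplitudes, such as vertical and lateral loads, directly comparable.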
The results show that the spatial distribution of the sensors and the local sensitivity of the measuring points are crucial for the accuracy of the force reconstruction. Especially for forces with little direct effect on the measured quantities, such as the lateral components, the position and number of sensors determine the quality of the prediction.
In the following, an analysis of sensor positions is conducted to evaluate the influence of both sensor placement and number on model predictions, as well as on the detection of dynamic and nonlinear forces. To this end, the number of sensors used in the input vector of the ML model was systematically increased for both training and inference to assess the effect on the resulting prediction accuracy. In addition, the sequence of the sensors was varied. In Figure 9, the mean squared error (MSE) for different numbers, combinations and spatial arrangements of the sensors is given.
The results demonstrate that high prediction accuracy can be achieved through appropriate sensor placement, even when only a limited number of sensors are used. A purely random selection of sensors only achieves comparable accuracy with a significantly higher number of sensors. Very good prediction results have already been achieved by a few specifically selected sensors. This indicates that a small number of well-chosen sensors can capture the critical dynamic behaviour of the system, thereby reducing data requirements while maintaining robust force reconstruction.
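A targeted sensor selection of this kind can be illustrated with greedy forward selection. This is a swapped-in technique used here for illustration only; it is not claimed to be the procedure followed in this study. In the sketch below, `evaluate` is a hypothetical callback that would, in practice, train and validate the model on a given sensor subset and return its MSE. Here it is replaced by a toy function with invented informativeness weights.

```python
def greedy_forward_selection(candidates, evaluate, max_sensors):
    """Add one sensor at a time, always choosing the one that lowers the error most.

    Returns a list of (selected_subset, error) pairs, one per added sensor.
    """
    selected = []
    remaining = list(candidates)
    history = []
    while remaining and len(selected) < max_sensors:
        # Try each remaining sensor together with those already selected.
        best = min(remaining, key=lambda s: evaluate(selected + [s]))
        selected.append(best)
        remaining.remove(best)
        history.append((tuple(selected), evaluate(selected)))
    return history


# Toy stand-in for model training: invented weights, not results from the study.
toy_weights = {1: 0.1, 2: 2.0, 3: 0.5, 4: 1.5}


def toy_mse(subset):
    """Hypothetical error that decreases as more informative sensors are added."""
    return 1.0 / (1.0 + sum(toy_weights[s] for s in subset))


for subset, error in greedy_forward_selection([1, 2, 3, 4], toy_mse, max_sensors=3):
    print(subset, round(error, 3))
```

In this toy example, the two most informative sensors are selected first and already account for most of the achievable error reduction, which mirrors the observation that a few well-placed sensors can be sufficient. With a real model, each call to `evaluate` involves a full training run, so the number of candidate sensors should be kept small.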
The method presented here is based on synthetically trained models. On real structures, the accuracy of the system can deviate significantly from the simulation results due to increasing model errors and unaccounted environmental conditions. This necessitates validation on the real system or targeted adaptation to existing measurement data. Nevertheless, the presented approach provides a preliminary framework for data-driven force determination in dynamic mechanical systems. With further refinement and experimental validation, it could support condition monitoring and load reconstruction. The inclusion of higher-frequency dynamics or nonlinear effects could also improve its ability to capture broadband excitation behaviour and supplement faulty measurements. Data-based algorithms cannot completely replace physical measurements, especially in long-term operation and with different bogies. However, they can significantly support predictions and analyses, though some physical uncertainties remain.
8. Conclusions
For the prediction of design loads based on operational data, an integrated framework incorporating MBD, FE and ML models was developed. For this purpose, the characteristic equations were solved in the frequency domain, and the modal transient response of the system to randomized input forces was generated in the time domain. This allowed the entire parameter space to be generated as an extended numerical dataset and, in a second step, used as a training dataset. The ML model trained in this way can predict the resulting loads in operation at the bogie. It was also shown that the required number of sensors can be significantly reduced through appropriate positioning. The work further demonstrated that sensor placement and overall data quality are decisive factors for prediction accuracy.
Furthermore, the ML method shows good robustness against noise and sensor misplacement. The development of the integrated FE simulation and ML framework demonstrates the feasibility of combining physics-based modelling with data-driven methods for improved load estimation in railway vehicle applications. High-quality datasets capturing complex dynamic behaviour, supported by a hybrid approach, in which stochastic unit force training data were generated through FE analysis and validated MBD simulations, enabled the training of predictive models.
Purely simulation-based models enable the prediction of forces and load distributions at the bogie, but the results are strongly model-dependent. Even if the models are validated against reference data, they only reflect an idealized operating condition, and uncertainties in measurement parameters, model parameters, damping behaviour or contact models can significantly affect the accuracy of force predictions. Consequently, the purely synthetic approach presented here, based on numerical simulation models, cannot compensate for measurement errors and altered boundary conditions in real physical systems. A detailed adaptation and critical validation of the results for the respective system are required. In addition to environmental conditions, differences between individual systems are also to be expected due to hysteresis effects in damping and strongly nonlinear force responses.
Therefore, the described method is a proof-of-concept application, the real-world effectiveness of which remains to be assessed through future experimental validation and field testing. In future work, the method’s predictions should be verified using suitable measurement systems, such as strain gauges or load cells, fitted to the damping systems or at the points of force application. The objective is a direct application for real-time bogie force monitoring in the context of predictive maintenance and a digital twin environment for vehicle dynamics and scenario testing. Furthermore, the approach is intended to support design optimization and the development of control strategies based on reconstructed operational loads. Future work will focus on validating the method, improving vehicle dynamics simulation, expanding scenario testing capabilities and further utilizing reconstructed loads for structural optimization.










