
As the main artery supporting economic activity in Japan, the Metropolitan Expressway comprises many structures that have been in service for a long time since their construction. Owing to the challenging operating environment, which includes a higher proportion of heavy vehicles than on other road bridges, deterioration is markedly advanced. Maintenance costs have continued to rise, while structural safety must still be ensured within limited budgets and personnel. To address these challenges, research on deterioration prediction that applies machine learning to the big data accumulated from repeated inspections has been progressing in recent years. This paper aims to achieve risk-based maintenance and management that considers the uncertainty observed in actual deterioration phenomena, and proposes a ‘hierarchical deterioration prediction framework’ for this purpose.

The deterioration of social infrastructure is an urgent societal issue confronting many developed nations. Particularly in Japan, infrastructure constructed intensively during the period of rapid economic growth is reaching old age, while a severe labour shortage due to a declining working-age population threatens the sustainability of the social foundation (Japan Society of Civil Engineers, 2020; Nakashima and Nagai, 2014; Nemoto, 2022). Among the various elements of infrastructure, the Metropolitan Expressway, which supports economic activities as Japan’s main artery, has many structures that have been in service for a long time and that are subjected to harsh operating conditions, including a higher proportion of heavy vehicles compared to other road bridges, resulting in significant deterioration. Consequently, maintenance and management costs are continuously increasing, and to ensure the safety of structures within limited budgets and personnel, there is an urgent need to move away from conventional management methods that rely on empirical rules and establish an efficient and rational maintenance and management system utilising data.

Corrosion in steel bridges is among the most critical forms of deterioration to manage, given its direct impact on structural capacity and service life. Corrosion damage is characterised by large differences in the required repair work, depending on how far it has progressed. In the early stages, when the corrosion depth is shallow, relatively inexpensive paint repairs are possible. As deterioration progresses and substantial section loss occurs in members, however, major interventions such as patch-plate reinforcement become unavoidable, and repair costs rise by a factor of several to as much as 20. Therefore, to minimise life cycle costs, it is extremely important to intervene with preventive maintenance at the appropriate time, before irreparable damage occurs. However, the number of bridges requiring management is enormous, and it is physically and economically impossible to respond immediately to all discovered damage. Rational decision making about which damage to prioritise for repair is therefore indispensable. Traditionally, this prioritisation has relied heavily on engineers’ experience-based judgement, using fragmented information such as inspection reports and site photographs. Such an approach depends on the tacit knowledge of individual engineers, takes time, and has limitations in terms of objectivity and prediction accuracy.

To address these challenges, research on deterioration prediction using machine learning, leveraging big data accumulated from repeated inspections, has been actively conducted worldwide in recent years (Lee et al., 2008; Spencer et al., 2019; Bao and Li, 2021). Attempts to predict future damage occurrence and its progression with high accuracy using methods such as random forest, support vector machines, and neural networks have yielded many successful results. However, many of these prior studies have focused on deterministic binary classification, such as ‘damage occurs/does not occur’, or point prediction of the deterioration amount. Actual deterioration phenomena inherently involve many uncertain factors, such as variations in material properties, differences in local environmental conditions, and measurement errors. A fundamental problem remains that deterministic predictions alone cannot fully evaluate the future risks posed by such deterioration.

Therefore, this research aims to overcome the limitations of conventional deterministic deterioration prediction and ultimately achieve risk-based maintenance and management that considers uncertainty. As an approach to achieving this, this paper proposes a new ‘hierarchical deterioration prediction framework’ consisting of the following two stages:

First stage (wide-area screening): Efficiently narrow down a vast number of sound and damaged locations to identify those with a high probability of future deterioration progression. At this stage, random forest, which has relatively low computational cost and is easy to interpret, is used.

Second stage (detailed risk assessment): Apply Bayesian neural networks (BNN) to the narrowed-down critical locations. A significant feature of BNN is its ability to output prediction results not as a single value, but as a probability distribution indicating the degree of certainty. This makes it possible to probabilistically understand the future corrosion progression rate.

The contributions of this research to academia and practice can be summarised in the following three points:

  • It introduces a Bayesian approach to predict corrosion damage in bridges, presenting a new method to quantify the uncertainty inherent in the prediction.

  • It establishes a practical framework for probabilistically evaluating the member’s capacity reduction rate as a risk, based on the predicted probability distribution of corrosion depth. This allows for the provision of more sophisticated and rational decision making support information, such as ‘there is an X% probability that the capacity will fall below 90% of its sound state in five years’.

  • It demonstrates the effectiveness of the proposed framework using a vast dataset of actual data accumulated on the Metropolitan Expressway, showcasing its practicality.

The structure of this paper is as follows: Section 2 describes the construction and preprocessing of the dataset used in this research. Section 3 details the proposed hierarchical prediction framework and presents the construction results and evaluation of each model using random forest and BNN. Section 4 discusses the method of applying BNN prediction results to structural capacity evaluation. Section 5 provides an overall discussion of the research, and the final Section 6 presents conclusions and prospects.

The data used in the research comprised inspection data, various structure ledgers (including reinforcement history, etc.), traffic volume data, and alignment data obtained from databases held by the Metropolitan Expressway for actual maintenance and management use. Figure 1 shows the specific types of data included and their names.

Figure 1.
A schematic shows how inspection records link to base records using a shared kihon identifier across two tables and then merge into multiple infrastructure registers.The image shows two main tables and their relationship. The upper table is an inspection register with columns for inspection i d, inspection year, route name, position, member, damage summary, and kihon i d. Each row lists an inspection with a corresponding kihon i d. The middle table is a base register with columns for kihon i d, route name, starting pier, ending pier, inbound or outbound direction, and mainline or ramp. Matching kihon i d values in both tables are visually connected to indicate a relational link. The lower section shows that, after linking through the kihon i d, information is merged into several registers, including inspection register, base register, superstructure register, painting register, pavement register, expansion joint register, and traffic volume register, forming an integrated data structure.

Dataset construction process for data analysis


To organise these data types for deterioration prediction, an integrated analysis dataset was created by linking them based on the basic ledger ID, and for those not linked by the basic ledger ID, by linking them by bridge number and direction. The details are shown in Figure 1.
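The two-step linking described above can be illustrated with a minimal sketch. The field names (`kihon_id`, `bridge_no`, `direction`, `route`) and the sample records are hypothetical and are not taken from the actual databases; the sketch only shows the idea of a primary key with a fallback key.

```python
def link_inspections(inspections, base_by_id, base_by_bridge):
    """Attach base-ledger attributes to each inspection record.

    The basic ledger ID is tried first; records without a usable ID fall back
    to the (bridge number, direction) pair.
    """
    linked = []
    for rec in inspections:
        base = base_by_id.get(rec.get("kihon_id"))
        if base is None:
            # Fallback key: bridge number and direction
            base = base_by_bridge.get((rec.get("bridge_no"), rec.get("direction")))
        merged = dict(rec)
        merged.update(base or {})
        merged["linked"] = base is not None  # flag records that could not be linked
        linked.append(merged)
    return linked


# Hypothetical base ledger, indexed by both keys
base_rows = [
    {"kihon_id": "K001", "bridge_no": "B-10", "direction": "inbound", "route": "Route A"},
    {"kihon_id": "K002", "bridge_no": "B-11", "direction": "outbound", "route": "Route B"},
]
base_by_id = {r["kihon_id"]: r for r in base_rows}
base_by_bridge = {(r["bridge_no"], r["direction"]): r for r in base_rows}

# Hypothetical inspection records: one with an ID, one without, one unlinkable
inspections = [
    {"inspection_id": 1, "kihon_id": "K001", "damage": "Corrosion"},
    {"inspection_id": 2, "kihon_id": None, "bridge_no": "B-11",
     "direction": "outbound", "damage": "Corrosion"},
    {"inspection_id": 3, "kihon_id": None, "bridge_no": "B-99",
     "direction": "inbound", "damage": "Corrosion"},
]
linked = link_inspections(inspections, base_by_id, base_by_bridge)
```

Keeping an explicit `linked` flag makes it easy to count and review the records that neither key could match.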

Subsequently, data preprocessing was performed to enable the application of machine learning models, as the analysis dataset contained a mixture of numerical data, categorical data, missing values, and error values.

Specifically, character data such as bridge type, expansion joint type, elevated structure conditions, and rib shape were corrected for variations in notation and then subjected to one-hot encoding. For missing values and error values in numerical categories such as repair time, repainting time, and slab thickness, imputation methods were considered according to the features, and values were filled using those from adjacent spans, the mean, the mode, or 0.
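As an illustration of this preprocessing, the following sketch normalises notation variants, applies one-hot encoding, and fills missing numerical values. The column names, the alias table, and the sample values are assumptions made for the example; which imputation rule suits which feature is a judgement that the sketch does not attempt to reproduce.

```python
from collections import Counter
from statistics import mean


def one_hot(rows, column, aliases=None):
    """One-hot encode `column` after mapping notation variants to one spelling."""
    aliases = aliases or {}
    values = [aliases.get(r[column], r[column]) for r in rows]
    categories = sorted(set(values))
    encoded = []
    for r, v in zip(rows, values):
        out = {k: val for k, val in r.items() if k != column}
        for c in categories:
            out[f"{column}_{c}"] = 1 if v == c else 0
        encoded.append(out)
    return encoded


def impute(rows, column, strategy):
    """Fill missing (None) values using the mean, the mode, or 0."""
    known = [r[column] for r in rows if r[column] is not None]
    if strategy == "mean":
        fill = mean(known)
    elif strategy == "mode":
        fill = Counter(known).most_common(1)[0][0]
    elif strategy == "zero":
        fill = 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


# Hypothetical rows with a notation variant and a missing slab thickness
rows = [
    {"span": 1, "bridge_type": "Steel I-girder", "deck_thickness": 200.0},
    {"span": 2, "bridge_type": "steel I girder", "deck_thickness": None},
    {"span": 3, "bridge_type": "RC T-girder", "deck_thickness": 220.0},
]
aliases = {"steel I girder": "Steel I-girder"}

rows = impute(rows, "deck_thickness", "mean")
encoded = one_hot(rows, "bridge_type", aliases)
```

Filling from adjacent spans, which is also mentioned above, would follow the same pattern but look up the neighbouring span's value instead of a global statistic.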

It is known that the causes of bridge corrosion vary, depending on the location. For example, corrosion at the girder ends is easily caused by water leakage from expansion joints, and corrosion in continuous sections is easily caused by the absence of slab waterproofing and drain holes, as confirmed by past inspection results and previous research. Therefore, as shown in Figure 2, damage occurrence locations were classified into four types: girder end at the start pier side, girder end at the end pier side, continuous section, and cantilever section. These were also subjected to one-hot encoding during the data preprocessing stage.

Figure 2.
A bridge girder underside view highlights girder ends at start and end pier sides and distinguishes continuous and cantilever sections.The image shows a three-dimensional underside view of a bridge superstructure. The upper view labels the girder end at the start pier side on the left and the girder end at the end pier side on the right, each enclosed by a red rectangular outline. The lower view focuses on the underside of the girder and deck system. A red dashed boundary separates two structural regions. The left region is labelled continuous section and shows girders supported across multiple piers. The right region is labelled cantilever section and shows a projecting girder segment extending beyond a pier. Columns, cross beams, and deck elements are visible beneath the superstructure, illustrating the structural difference between continuous and cantilever behaviour.

Classification diagram of damage occurrence locations


First, from the vast amount of damage data, models were constructed to address two key points: the timing of damage manifestation and the likelihood of its progression, as illustrated in Figure 3. Since the amount of data was large, the random forest algorithm, which has a relatively low computational cost and can be expected to achieve high prediction accuracy, was used to construct prediction models for the occurrence of new damage (Figure 4) and the progression of existing damage (Figure 5). Random forest is an ensemble learning method that builds multiple decision trees and combines their outputs, by majority vote in the case of classification, into a single prediction. A conceptual diagram is shown in Figure 6.

Figure 3.
A conceptual timeline shows bridge condition degrading over time with marked points for corrosion onset, damage worsening, repair actions, and corrosion progression rate.The diagram plots bridge condition on the vertical axis against time on the horizontal axis. A sloping line represents gradual deterioration. Point 1 marks when corrosion damage becomes apparent. Point 2 indicates a stage where damage worsens, shown by a sharper drop in condition. Interventions labelled minor repairs appear at intermediate stages and partially improve or stabilise the condition trajectory. A later stage shows major repairs, represented by a larger vertical improvement followed by continued gradual decline. Point 3 highlights the question of corrosion progression rate, illustrated by an extended downward shaded segment, emphasising uncertainty in how quickly the condition reduces between inspections and repairs.

The relationship between bridge condition and time, and key points to focus on

Figure 4.
A corroded steel girder connection with heavy rust, section loss around bolts, and damaged mesh protection.The image shows a steel bridge girder connection affected by advanced corrosion. Thick rust layers cover the girder, connection plates, and bolts, with visible section loss along the steel surfaces. A steel plate is fixed to the girder using multiple bolts, all showing corrosion products. The surrounding protective wire mesh is broken, distorted, and partially detached. Dark corrosion stains spread across adjacent steel and concrete surfaces, indicating prolonged exposure and material deterioration at the connection zone.

New damage

Figure 5.
A steel bridge girder connection showing localized corrosion and coating loss highlighted by two marked areas near the lower flange.The image shows the underside of a steel bridge girder connection with visible corrosion damage. Two areas of deterioration are circled along the lower edge of the girder web and stiffener region. The protective coating has peeled away, exposing dark, corroded steel with uneven surfaces and material loss. Rust staining extends along the girder edge and around bolted connections. A wire mesh installed below the girder is deformed, sagging, and partially detached, with accumulated corrosion debris resting on it. The surrounding steel members appear generally intact but show early corrosion staining near joints and edges, indicating moisture exposure and ongoing degradation at the connection zone.

Damage progression of existing damage

Figure 6.
A schematic shows a dataset split into multiple decision trees whose results combine through majority voting and averaging to give a final result.The diagram shows a dataset at the top that branches into three separate decision trees. Each decision tree contains multiple internal nodes and leaf nodes that process the data independently. Below each tree, a box labelled result shows the output from that tree. The three results then connect to a single box labelled majority voting and averaging, indicating that the individual outputs are combined. A final arrow leads downward to a box labelled final result, representing the aggregated prediction produced from all decision trees together.

Conceptual diagram of random forest
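The idea in Figure 6 can be sketched in a few lines. This is a deliberately simplified illustration and not the model used in this study: each tree is reduced to a depth-one tree (a stump), the toy data are invented, and only 25 trees are built. It shows the three ingredients of the method: bootstrap sampling, a random feature subset per tree, and majority voting.

```python
import random
from collections import Counter


def fit_stump(X, y, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    n_sub = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (with replacement)
        feats = rng.sample(range(n_features), n_sub)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over the individual tree outputs."""
    votes = [l_lab if row[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return Counter(votes).most_common(1)[0][0]


# Invented toy data: label 1 stands for 'damage', label 0 for 'no damage'
X = [[1, 10], [2, 20], [3, 30], [8, 80], [9, 90], [10, 100]]
y = [0, 0, 0, 1, 1, 1]
forest = fit_forest(X, y)
```

Individual stumps trained on an unlucky bootstrap sample can be wrong, but the vote across many trees is stable, which is the property the ensemble relies on.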


Furthermore, since the original dataset had a very large number of features, the feature set was narrowed down by trial and error to the features that gave the highest accuracy. Features with an importance of less than 0.01 and highly correlated features (e.g. ‘Heavy vehicle ratio’ and ‘Traffic volume heavy vehicle’) were integrated, and features considered to have little relation to deterioration were deleted. The dataset contained 289,224 data points, the training and test data were split 9:1, and the number of trees in the random forest was set to 500. Table 1 shows the initial features and the features after the narrowing-down process.
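The following sketch illustrates two of the steps mentioned above: flagging features whose importance falls below 0.01, and a 9:1 split of the data. The importance values are invented for the example, and the helper names are hypothetical; the actual narrowing-down also involved engineering judgement that a threshold alone does not capture.

```python
import random


def split_by_importance(importances, threshold=0.01):
    """Separate features into those at or above the threshold and those below it."""
    kept = [name for name, imp in importances.items() if imp >= threshold]
    flagged = [name for name, imp in importances.items() if imp < threshold]
    return kept, flagged


def train_test_split(rows, test_fraction=0.1, seed=0):
    """Shuffle a copy of the data and hold out `test_fraction` of it for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Invented importance values, for illustration only
importances = {"Previous_Assessment": 0.31, "Water leakage": 0.12, "Deck_Strength": 0.004}
kept, flagged = split_by_importance(importances)

# Row indices stand in for the 289,224 data points
train_idx, test_idx = train_test_split(range(289_224), test_fraction=0.1)
```

With 289,224 data points, a 9:1 split leaves roughly 28,900 points for testing, which is of the same order as the totals in the confusion matrices reported for the models.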

Table 1. The initial features and the features after the narrowing-down process

Initial features | No. of types
Damage occurrence locations (e.g. Cantilever) | 4
Repair (e.g. Temporary_Repair) | 2
Damage summary (e.g. Corrosion) | 4
Permanent_Repair | 1
Previous_Assessment | 1
Water leakage | 1
Main_Girder_Max_Height | 1
Main_Girder_Min_Height | 1
Main_Girder_Count | 1
Main_Girder_Max_Spacing | 1
Main_Girder_Min_Spacing | 1
Main_Girder_Span_Count | 1
Main_Girder_Position_Relationship | 1
Main_Girder_Span_Length | 1
Starting_Pier_Width | 1
Ending_Pier_Width | 1
Deck_Thickness | 1
Main_Girder_Max_Spacing | 1
Deck_Strength | 1
Main_Rebar_Diameter | 1
Main_Rebar_Spacing | 1
Distribution_Rebar_Diameter | 1
Distribution_Rebar_Spacing | 1
Distribution_Rebar_Ratio_Percent | 1
RC Deck Reinforcement (e.g. Short_Plate) | 3
Waterproofing_Presence | 1
Distance of current starting expansion joint | 1
Distance of current ending expansion joint | 1
Type of bridge superstructure (e.g. Continuous RC T-girder) | 18
Type of starting bridge bearing (e.g. Move bearing) | 5
Type of ending bridge bearing (e.g. Move bearing) | 5
Design load (e.g. TL_20) | 3
Rib cross-section shape (e.g. U) | 6
Design code (e.g. Road_Instruction_Steel_H8) | 20
Expansion joint starting material (e.g. Rubber) | 5
Expansion joint ending material (e.g. Rubber) | 5
Traffic volume | 1
Heavy vehicle ratio | 1
Traffic volume heavy vehicle | 1
Deck type (e.g. Precast_PC) | 7
Longitudinal gradient | 1
Coating (e.g. NU-WF-1) | 36
Rainfall (mm/year) | 1
Lowest monthly average minimum temperature | 1
Expressway route no (e.g. No. 1 Haneda Line) | 28
Area under the elevated structure (e.g. River) | 9
Inspection interval from the installation date | 1
Inspection interval from the painting repair date | 1
Inspection interval from the pavement base layer installation date | 1
Inspection interval from the starting expansion joint repaired date | 1
Inspection interval from the ending expansion joint repaired date | 1

The features after narrowing down (predicting damage progression) | No. of types
Damage occurrence locations (e.g. Cantilever) | 4
Main_Girder_Max_Height | 1
Main_Girder_Min_Height | 1
Main_Girder_Max_Spacing | 1
Main_Girder_Min_Spacing | 1
Main_Girder_Span_Length | 1
Deck_Thickness | 1
Main_Rebar_Spacing | 1
Distance of current starting expansion joint | 1
Distance of current ending expansion joint | 1
Previous_Assessment | 1
Water leakage | 1
Inspection interval from the installation date | 1
Inspection interval from the painting repair date | 1
Inspection interval from the pavement base layer installation date | 1
Inspection interval from the starting expansion joint repaired date | 1
Inspection interval from the ending expansion joint repaired date | 1

The features after narrowing down (predicting the occurrence of new damage) | No. of types
Damage occurrence locations (e.g. Cantilever) | 4
Main_Girder_Max_Height | 1
Main_Girder_Min_Height | 1
Main_Girder_Max_Spacing | 1
Main_Girder_Min_Spacing | 1
Main_Girder_Span_Length | 1
Starting_Pier_Width | 1
Ending_Pier_Width | 1
Deck_Thickness | 1
Main_Rebar_Spacing | 1
Distribution_Rebar_Ratio_Percent | 1
Waterproofing_Presence | 1
Water leakage | 1
Traffic volume | 1
Traffic volume heavy vehicle | 1
Inspection interval from the installation date | 1
Inspection interval from the painting repair date | 1
Inspection interval from the pavement base layer installation date | 1
Inspection interval from the starting expansion joint repaired date | 1
Inspection interval from the ending expansion joint repaired date | 1

The prediction results for the occurrence of new damage are shown in Figure 7: Accuracy 0.926, Recall 0.922, Precision 0.925, and F1 score 0.924. The prediction results for damage progression are shown in Figure 8: Accuracy 0.981, Recall 0.869, Precision 0.929, and F1 score 0.898. For damage progression, Accuracy was the highest of the four metrics and Recall the lowest. As shown in Figure 8, among the cases where progression was actually observed (Observed: 1), the model incorrectly predicted ‘no progression’ (Predicted: 0) for 20.8%. Such false negatives are errors on the unsafe side: they mean that actual deterioration is overlooked, which poses a direct risk to structural safety.

Figure 7.
A confusion matrix compares observed classes 0 and 1 with predicted classes 0 and 1, showing four cell counts written in scientific notation.The chart shows a two-by-two confusion matrix with observed values on the vertical axis and predicted values on the horizontal axis, labelled 0 and 1. The top left cell shows 1.1 times 10 to the power 4 for observed 0 and predicted 0. The top right cell shows 1.2 times 10 to the power 3 for observed 0 and predicted 1. The bottom left cell shows 9.4 times 10 to the power 2 for observed 1 and predicted 0. The bottom right cell shows 1.6 times 10 to the power 4 for observed 1 and predicted 1.

Prediction results for new damage occurrence

Figure 8.
A confusion matrix compares observed classes 0 and 1 with predicted classes 0 and 1, with four cell values shown in scientific notation.The chart presents a two-by-two confusion matrix with observed values on the vertical axis and prediction values on the horizontal axis, labelled 0 and 1. The top left cell shows 2.6 times 10 to the power 4 for observed 0 and predicted 0. The top right cell shows 1.8 times 10 to the power 2 for observed 0 and predicted 1. The bottom left cell shows 2.9 times 10 to the power 2 for observed 1 and predicted 0. The bottom right cell shows 1.1 times 10 to the power 3 for observed 1 and predicted 1.

Prediction results for damage progression


On the other hand, when engineers judged whether damage would progress (conventional method), the results in Figure 9 showed Accuracy: 0.657, Recall: 0.692, Precision: 0.310, and F1 score: 0.429.

Figure 9.
A confusion matrix shows observed classes 0 and 1 against predicted classes 0 and 1 with four numeric cell values.The chart shows a two-by-two confusion matrix with Observed on the vertical axis and Prediction on the horizontal axis, each labelled 0 and 1. The top left cell, observed 0 and predicted 0, shows 37. The top right cell, observed 0 and predicted 1, shows 20. The bottom left cell, observed 1 and predicted 0, shows 4. The bottom right cell, observed 1 and predicted 1, shows 9.

Prediction results for damage progression (conventional method)
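For reference, the four metrics follow directly from the cells of a confusion matrix. The short sketch below uses the standard definitions and the counts shown in Figure 9 (37, 20, 4, and 9), and reproduces the values reported for the conventional method; the function name is chosen for the example.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, recall, precision and F1 score for the positive class (label 1)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)        # share of observed progression that was predicted
    precision = tp / (tp + fp)     # share of predicted progression that was observed
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Counts read from Figure 9 (engineers' judgement)
m = classification_metrics(tn=37, fp=20, fn=4, tp=9)
print({k: round(v, 3) for k, v in m.items()})
# {'accuracy': 0.657, 'recall': 0.692, 'precision': 0.31, 'f1': 0.429}
```

The low precision reflects the 20 cases in which progression was predicted but not observed, against only 9 correct predictions of progression.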


Although engineers spent more than a minute per case checking photographs and data, the resulting accuracy was poor. A manual, time-consuming process of this kind is prone to human error and subjectivity. The likely cause is that engineers check data such as bridge specifications, repair history, and traffic volume item by item and then judge by empirical rules, which makes it difficult to account for the interrelationships between the various parameters.

To address the challenges in damage progression prediction described in the previous section, and in particular to reduce the number of cases in which ‘no progression’ was predicted although progression was actually observed, image vectors extracted by a convolutional neural network (CNN) were combined with other inspection data, basic structure data, and environmental data and fed into a machine learning model (random forest) (Forkan et al., 2022; Ameli et al., 2024). A CNN consists primarily of convolutional layers, pooling layers, and fully connected layers, and extracts features from an image by passing it through these layers. When the input $X$ has dimensions $(H, W, C)$ and the filter $F$ has dimensions $(C, F_H, F_W, D)$, the output $Y_{ijd}$ of a convolutional layer, written here for unit stride and no padding, is as follows:

$$Y_{ijd} = \sum_{c=0}^{C-1} \sum_{h=0}^{F_H-1} \sum_{w=0}^{F_W-1} X_{(i+h)(j+w)c} \, F_{c,h,w,d} + b_d$$

where $X_{abc}$ is the $(a, b, c)$ component of the input data, $F_{c,h,w,d}$ is the $(c, h, w, d)$ component of the filter, $b_d$ is the bias term (a constant defined for each filter), and $Y_{ijd}$ is the $(i, j, d)$ component of the output feature map.

In addition, stride and padding are parameters that control the behaviour of the convolution operation. Stride is the amount of movement when sliding the filter and is used to reduce the size of the output feature map. Padding is the process of embedding fixed values around the input data. Zero-padding, in which zeros are used, enables features at image edges to be learned effectively.
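A plain-Python sketch of the convolution operation, including stride and zero-padding, is given below. It follows the dimension conventions stated above (input of size (H, W, C), filter of size (C, FH, FW, D), one bias per filter) and is written for clarity rather than speed; the function name and the toy input are assumptions for the example.

```python
def conv2d(X, F, bias, stride=1, pad=0):
    """Convolution of X[H][W][C] with F[C][FH][FW][D] and one bias value per filter."""
    H, W, C = len(X), len(X[0]), len(X[0][0])
    FH, FW, D = len(F[0]), len(F[0][0]), len(F[0][0][0])

    # Zero-padding: embed the input in a larger array filled with zeros
    Hp, Wp = H + 2 * pad, W + 2 * pad
    Xp = [[[0.0] * C for _ in range(Wp)] for _ in range(Hp)]
    for a in range(H):
        for col in range(W):
            Xp[a + pad][col + pad] = list(X[a][col])

    # The stride controls how far the filter moves, and hence the output size
    out_h = (Hp - FH) // stride + 1
    out_w = (Wp - FW) // stride + 1
    Y = [[[0.0] * D for _ in range(out_w)] for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            for d in range(D):
                total = bias[d]
                for c in range(C):
                    for h in range(FH):
                        for w in range(FW):
                            total += Xp[i * stride + h][j * stride + w][c] * F[c][h][w][d]
                Y[i][j][d] = total
    return Y


# Toy input: a 4 x 4 single-channel image of ones and one 3 x 3 filter of ones
X = [[[1.0] for _ in range(4)] for _ in range(4)]
F = [[[[1.0] for _ in range(3)] for _ in range(3)]]
bias = [0.5]

valid = conv2d(X, F, bias)                     # no padding: 2 x 2 output
same = conv2d(X, F, bias, pad=1)               # zero-padding keeps the 4 x 4 size
strided = conv2d(X, F, bias, stride=2, pad=1)  # stride 2 halves the output size
```

In the padded output, corner values are smaller than interior values because part of the filter window falls on the zeros added around the image.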

The pooling layer is used to reduce the size of the feature maps obtained from the convolutional layer. One purpose of using it is to reduce computational complexity, and another is to absorb minute positional shifts and improve robustness.
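Both purposes can be seen in a small example. The sketch below uses 2 x 2 max pooling, which is an assumption chosen for illustration since the pooling type is not specified at this point, together with global average pooling over a whole feature map. The feature maps are invented.

```python
def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling over size x size windows of a 2-D feature map."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [max(fmap[r + dr][c + dc] for dr in range(size) for dc in range(size))
         for c in range(0, cols - size + 1, size)]
        for r in range(0, rows - size + 1, size)
    ]


def global_average_pool(fmap):
    """Average of all values in a 2-D feature map (one number per channel)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# A single strong activation, and the same activation shifted by one pixel
original = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0],
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled_original = max_pool2d(original)  # 4 x 4 map reduced to 2 x 2
pooled_shifted = max_pool2d(shifted)
```

The two pooled maps are identical although the inputs differ by one pixel, and each has a quarter of the original number of values. The shift is absorbed only because both positions fall inside the same pooling window.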

In this study, EfficientNetB5, a CNN known for its high accuracy and efficiency in image classification tasks, was used. It applies a compound scaling method to balance the model’s depth, width, and resolution. In this model, strided convolution was performed, and Global Average Pooling 2D was used as the pooling layer when passing the feature map output by the EfficientNetB5 base model to the classification head. The training and test data were split in an 8:2 ratio. Because the amount of data was large, the data were loaded sequentially: they were first loaded in chunks (num_chunks = 356) to manage memory usage, and batches (batch_size = 16) were then generated from these chunks for model training. For the training data, augmentation by rotation, width/height shift, shear, zoom, and horizontal flip, together with a fill mode for newly created pixels, was applied to improve generalisation performance.
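The chunk-then-batch loading can be sketched with a generator. The `load_chunk` callable and the chunk size of 40 items are hypothetical stand-ins for reading image data from disk; only the parameter names `num_chunks` and `batch_size` come from the description above.

```python
def iter_batches(load_chunk, num_chunks, batch_size):
    """Yield batches while holding only one chunk in memory at a time."""
    for chunk_id in range(num_chunks):
        chunk = load_chunk(chunk_id)  # e.g. read one file of images from disk
        for start in range(0, len(chunk), batch_size):
            yield chunk[start:start + batch_size]


def fake_load_chunk(chunk_id, items_per_chunk=40):
    """Hypothetical loader returning (chunk, item) identifiers instead of images."""
    return [(chunk_id, i) for i in range(items_per_chunk)]


# Three small chunks are used here instead of the 356 chunks of the study
batch_sizes = [len(b) for b in iter_batches(fake_load_chunk, num_chunks=3, batch_size=16)]
print(batch_sizes)  # [16, 16, 8, 16, 16, 8, 16, 16, 8]
```

In this sketch, batches never span two chunks, so the last batch of each chunk may be smaller than `batch_size`; whether the original pipeline handled chunk boundaries this way is not stated.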

The architecture of EfficientNetB5 used in this study is shown in Figure 10.

Figure 10.
Figure description: A convolutional neural network architecture with sequential Conv and MBConv blocks, varying block counts and feature map sizes, and a fine-tuning stage. The pipeline starts with a 224 × 224 × 3 input that passes through a 3 × 3 Conv layer to produce 112 × 112 × 48. This is followed by three MBConv1 3 × 3 blocks at 112 × 112 × 24 and five MBConv6 3 × 3 blocks at 56 × 56 × 40. The network continues with MBConv6 5 × 5 blocks: five at 28 × 28 × 64, seven at 14 × 14 × 128, and seven at 14 × 14 × 176. A fine-tuning stage spans these deeper layers. The architecture then includes three MBConv6 3 × 3 blocks at 7 × 7 × 304, followed by an MBConv6 1 × 1 layer producing 7 × 7 × 512, and ends with a 7 × 7 × 2048 output.

Conceptual diagram of CNN

ImageNet pre-trained weights were used; the first 477 layers were frozen, and the weights of the remaining 101 layers were fine-tuned. For the training settings, the loss function was cross-entropy, the optimiser was Adam, the initial learning rate was 1e−4, and the learning rate was reduced when no improvement was observed for three epochs. Furthermore, to prevent overfitting to the majority class caused by the imbalance between the ‘progression’ and ‘no progression’ classes, class weights were applied according to the number of data points in each class. During fine-tuning, both accuracy and loss fluctuated but improved overall as the epochs progressed. At the final epoch (epoch 30), as shown in Figure 11, the training accuracy was 82.73%, the validation accuracy was 90.54%, the training loss was 0.2085, and the validation loss was 0.1524.
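One common way to weight classes according to the number of data points is inverse-frequency weighting. The sketch below assumes the widely used "balanced" formula n_samples / (n_classes × class_count); the text does not state which formula was actually applied, and the label counts are illustrative only.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Illustrative labels only: 0 = 'no progression', 1 = 'progression'.
labels = np.array([0] * 75 + [1] * 25)
class_weight = inverse_frequency_weights(labels)
```

A dictionary of this form gives the minority class a proportionally larger contribution to the loss, so that errors on ‘progression’ samples are not swamped by the majority class.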

Figure 11.
Figure description: A training history plot with two line charts across epochs 0 to 30, each showing training and validation curves. The left plot shows model accuracy: training accuracy increases steadily from about 0.49 to about 0.83, while validation accuracy rises from about 0.43 to around 0.95 with fluctuations and several peaks above training accuracy after epoch 10. The right plot shows model loss: training loss decreases smoothly from about 0.61 to about 0.20, and validation loss decreases from about 0.84 to about 0.15 with noticeable oscillations. Both plots indicate improving performance over epochs, with the validation curves showing higher variability than the training curves.

The results of fine-tuning EfficientNetB5 showed both accuracy and loss

While a model for predicting ‘progression’ or ‘no progression’ from image data could be constructed, the accuracy was lower compared to the model constructed with random forest in Section 3.1. This suggests that the features organised in Figure 1 probably contain more important information related to corrosion progression than the image data.

Feature vectors were extracted from damage images using the EfficientNetB5 model, and multimodal learning was attempted by combining these features with tabular inspection data, structural data, and environmental data. Because using the feature vectors directly would yield far more image features than tabular features, principal component analysis was performed for dimensionality reduction. In addition, an attention mechanism was introduced in an attempt to extract the more important information.
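The dimensionality reduction and fusion step can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: the 2048-dimensional vectors mirror the final feature map depth shown in Figure 10, while the number of samples, the number of tabular columns, and the choice of 10 components are hypothetical, and random numbers stand in for real features.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project features onto their top principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
image_vectors = rng.normal(size=(200, 2048))  # placeholder CNN feature vectors
tabular = rng.normal(size=(200, 12))          # placeholder tabular features

image_pcs = pca_reduce(image_vectors, n_components=10)
fused = np.hstack([tabular, image_pcs])       # combined input for the classifier
```

In a real workflow the principal directions would be estimated on the training split only and then applied to the test split, so that no test information leaks into the reduction.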

As shown in Figure 12, there was no significant improvement in prediction accuracy compared to the random forest standalone model that did not use image data.
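Because the stated aim was to reduce cases where ‘no progression’ was predicted but progression was observed, the quantity of interest in each confusion matrix is the miss rate for class 1. The sketch below shows that calculation; the counts are rounded values of the kind read from Figure 12 and are used purely for illustration, and the helper name `miss_rate` is hypothetical.

```python
def miss_rate(cm):
    """cm[observed][predicted]; returns FN / (FN + TP) for class 1 ('progression')."""
    fn, tp = cm[1][0], cm[1][1]
    return fn / (fn + tp)

# Rounded counts, for illustration only (rows: observed 0/1, columns: predicted 0/1).
cm_example = [[2300, 140],
              [220, 920]]
rate = miss_rate(cm_example)
```

Comparing this rate across the matrices gives a more direct view of the target error than overall accuracy, which is dominated by the large ‘no progression’ class.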

Figure 12.
Figure description: Three side-by-side confusion matrices, each with observed classes (0 and 1) on the vertical axis and predicted classes (0 and 1) on the horizontal axis, reported as counts. In the first matrix, observed 0/predicted 0 is about 2.3 × 10^3, observed 0/predicted 1 is about 1.4 × 10^2, observed 1/predicted 0 is about 2.2 × 10^2, and observed 1/predicted 1 is about 9.2 × 10^2. In the second matrix, observed 0/predicted 0 is about 2.3 × 10^3, observed 0/predicted 1 is 77, observed 1/predicted 0 is about 3.0 × 10^2, and observed 1/predicted 1 is about 8.3 × 10^2. In the third matrix, observed 0/predicted 0 is about 2.3 × 10^3, observed 0/predicted 1 is 20, observed 1/predicted 0 is about 3.0 × 10^2, and observed 1/predicted 1 is about 8.4 × 10^2.

Confusion matrices for different models

Several reasons are possible: the original random forest model was already more accurate than the CNN-based model, which may reflect issues with the quantity and quality of the image data or with the CNN structure and training method; the information contained in the tabular data may already have been sufficient; or important information may have been lost through dimensionality reduction. Furthermore, a comparison of models with and without the attention mechanism showed that accuracy tended to worsen when the attention mechanism was introduced. This is presumed to be because important information was lost through dimensionality reduction, which prevented the attention mechanism from extracting crucial features. In the future, it will be important to make full use of existing data to improve prediction accuracy, and, building on this negative result, further consideration will be given to how photographic data can be used effectively.

The models discussed so far were built to determine whether corrosion would newly occur or whether existing corrosion damage would progress (0 or 1). Although such models could in principle be extended to predict the progression rate of corrosion damage, constructing a simple prediction model based solely on features was expected to be difficult. Furthermore, it has been pointed out that deterioration curves calculated from conventional inspection data are sensitive to variations in initial parameters, and the measured corrosion depth itself is likely to contain measurement errors and variations. To address this, a BNN, which can handle uncertainty, was used to construct a model that predicts the distribution of corrosion progression rates rather than a single value (Figure 13).

Figure 13.
Figure description: A schematic of a simplified artificial neural network. The input layer appears on the left and contains several circular units stacked vertically, with ellipses indicating additional inputs. Each input unit connects to multiple units in the hidden layer through straight lines that represent weighted connections. The hidden layer is arranged in two vertical groups of circular units, showing multiple processing stages, and each hidden unit connects forward to the next set of hidden units through weighted connections. On the far right, a single circular unit represents the output layer. The overall layout illustrates feedforward information flow from the input layer through the hidden layers to the output layer.

BNN architecture

Given a dataset D = {X, Y}, the probability distribution of the network’s weight parameters ω is modelled as follows:

$$ p(\omega \mid D) = \frac{p(D \mid \omega)\, p(\omega)}{p(D)} $$

where p(ω|D) is the posterior distribution, p(D|ω) is the likelihood, p(ω) is the prior distribution, and p(D) is the marginal likelihood.

Using the above, the prediction y* for new input data x* is also obtained as the following probability distribution:

$$ p(y^{*} \mid x^{*}, D) = \int p(y^{*} \mid x^{*}, \omega)\, p(\omega \mid D)\, d\omega $$

Previous research has indicated that corrosion progression follows a power law, which is widely accepted as a good representation of the physical corrosion process. Therefore, an investigation was conducted to determine whether a BNN could be used to predict the distribution of the power-law parameters from the features and the initial corrosion depth, and then to estimate the corrosion depth at the second inspection. Research on applying the power law to weathering steel has recently been active. Historically, NCHRP Report 272 presented values for A and B under various conditions based on 15–17 years of exposure test data from Larrabee and Coburn (1961) and Horton (1971). In addition, based on the research of Albrecht and Naeemi (1984) and Kayser and Nowak (1989), Moran Yanez (2016) organised the average values, coefficients of variation, and correlation coefficients of A and B for carbon steel and weathering steel in rural, urban, and coastal environments.

The power law is given by:

$$ C = A t^{B} $$

where C: corrosion depth (mm), t: exposure time (years), and A, B: parameters.

The inputs were the features X, the initial corrosion depth C_1, the corrosion depth at the next inspection C_2, and the period until the next inspection Δt. Since a rational estimation of t_1, the exposure time at the first inspection, was difficult, the value of B was estimated while varying t_1 between 1 and 5 years.

A was excluded from the values to be estimated because it can be calculated from the formula once C, t, and B are determined or estimated.
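The reason A need not be estimated can be made explicit. Assuming (this is an inference from the definitions, not a formula stated in the text) that both inspections lie on the same power-law curve C = A t^B, with C_1 observed at t_1 and C_2 at t_1 + Δt, then A = C_1 / t_1^B and C_2 = C_1 ((t_1 + Δt) / t_1)^B. A minimal sketch, with the hypothetical helper name `predict_c2` and illustrative numbers:

```python
def predict_c2(c1, t1, dt, b):
    """Corrosion depth at the next inspection under the power law C = A * t**B."""
    a = c1 / t1 ** b           # A recovered from the first inspection
    return a * (t1 + dt) ** b  # identical to c1 * ((t1 + dt) / t1) ** b

# Illustrative values only: 1.0 mm at t1 = 1 year, next inspection 3 years later.
c2 = predict_c2(c1=1.0, t1=1.0, dt=3.0, b=0.5)
```

Under this reading, a posterior distribution over B translates directly into a distribution over C_2 once t_1 is fixed, which is why t_1 was varied as a sensitivity parameter.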

Convergence of the posterior estimation is a prerequisite for a BNN. Therefore, candidate prior distributions for each parameter were tested with a simple model to confirm convergence. Finally, a network with two hidden layers (four nodes in the first layer and three nodes in the second layer) and tanh as the activation function was used.

Based on existing literature, B was constrained to the range 0–1 using a sigmoid function, and the BNN was fitted while varying t_1 between 1 and 5. The maximum value of t_1 was set to 5 because the statutory inspection for road bridges in Japan is stipulated at 5-year intervals, so the period under consideration was judged not to exceed 5 years. Markov Chain Monte Carlo was used for posterior distribution estimation, with the No-U-Turn Sampler (NUTS) used to speed up the computation.
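The network structure described above (two tanh hidden layers with four and three nodes, and a sigmoid output that keeps B between 0 and 1) can be sketched as a forward pass in NumPy. This is only an illustration of how a distribution over weights induces a distribution over B: the standard normal priors, the number of input features, and the function names are assumptions, and the draws here come from the prior, whereas the study sampled the posterior with NUTS.

```python
import numpy as np

def bnn_forward(x, params):
    """Two tanh hidden layers (4 and 3 nodes) and a sigmoid output for B."""
    h1 = np.tanh(x @ params["w1"] + params["b1"])
    h2 = np.tanh(h1 @ params["w2"] + params["b2"])
    z = h2 @ params["w3"] + params["b3"]
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid keeps B in (0, 1)

def sample_prior(n_features, rng):
    """Draw one weight set from standard normal priors (an assumption)."""
    shapes = {"w1": (n_features, 4), "b1": (4,), "w2": (4, 3),
              "b2": (3,), "w3": (3, 1), "b3": (1,)}
    return {name: rng.normal(size=shape) for name, shape in shapes.items()}

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 6))  # 5 placeholder locations, 6 placeholder features
# Each weight draw gives one B per location; many draws form a distribution.
b_draws = np.stack([bnn_forward(x, sample_prior(6, rng))[:, 0] for _ in range(1000)])
```

Replacing the prior draws with posterior draws from the sampler yields the predictive distribution of B for each location.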

With a sample size of 10 000, tuning steps of 20 000, 4 chains, a target acceptance rate of 0.95, and a maximum tree depth of 10, the best accuracy was obtained when t_1 = 1, with 0 divergences, Max R-hat = 1.006, Min Bulk ESS (minimum effective sample size for the central part of the posterior distribution) = 1002, and Min Tail ESS (minimum effective sample size for the tails of the posterior distribution) = 2118. This indicates that the estimation of the posterior distribution converged. Similar convergence of posterior distribution estimation was confirmed for other t_1 values (2, 3, 4, 5) by checking Max R-hat, Min Bulk ESS, and Min Tail ESS.
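The role of R-hat as a convergence check can be illustrated with the classic Gelman-Rubin statistic, which compares between-chain and within-chain variance. Current sampling toolkits typically report a rank-normalised split variant, so this simpler form is an assumption made for clarity and will not reproduce the reported values exactly; the synthetic chains are illustrative.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic R-hat for an array of shape (n_chains, n_draws)."""
    n_chains, n_draws = chains.shape
    chain_means = chains.mean(axis=1)
    between = n_draws * chain_means.var(ddof=1)  # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance W
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 10_000))                    # four chains, same target
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])  # one chain shifted away
rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

Values close to 1 indicate that the chains agree, as in the well-mixed case; a chain exploring a different region inflates the statistic well above 1.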

Since t_1 was varied from 1 to 5, Figure 14 shows a scatter plot combining the predicted and actual values for t_1 = 1, 2, 3, 4, and 5. The mean absolute error was 0.613 and the R-squared was 0.291. For BNN model construction and performance validation, the data were split into training and test sets at a ratio of 8:2.

Figure 14.
Figure description: Three scatter plots comparing actual and predicted second corrosion depth for the combined test data across t_1 = 1, 2, 3, 4, and 5, each with a diagonal one-to-one reference line. The top left plot shows predicted median values against actual values; points cluster below the reference line at higher actual depths, showing underprediction. The top right plot shows predicted 2.5th percentile values against actual values, where most points lie well below the reference line, indicating conservative lower bounds. The bottom plot shows predicted 97.5th percentile values against actual values, with many points above the reference line, indicating upper uncertainty limits. Together, the plots illustrate prediction spread, bias, and uncertainty relative to measured corrosion depth.

Distribution of predicted and actual C2 values by BNN (Median, 2.5th percentile, 97.5th percentile)

Although the median of the predictions shows only a modest correlation with the observed values (R-squared = 0.291), most data points fall within the 95% credible interval, suggesting that the model appropriately captures the uncertainty.
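The check that observations fall within the 95% credible interval can be computed directly from posterior predictive samples. The sketch below uses synthetic data, and the function name `interval_summary` and all numbers are hypothetical; it is not the study's evaluation code.

```python
import numpy as np

def interval_summary(samples, observed):
    """samples: (n_draws, n_points) posterior predictive draws of C2."""
    lower, median, upper = np.percentile(samples, [2.5, 50.0, 97.5], axis=0)
    inside = (observed >= lower) & (observed <= upper)
    return median, upper, float(inside.mean())

rng = np.random.default_rng(0)
true_mean = rng.uniform(0.5, 3.0, size=200)  # synthetic depths (mm)
samples = true_mean + rng.normal(scale=0.5, size=(4000, 200))
observed = true_mean + rng.normal(scale=0.5, size=200)
median, upper, coverage = interval_summary(samples, observed)
```

A coverage close to 0.95 indicates a well-calibrated interval, and the `upper` array corresponds to the conservative 97.5th percentile values.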

The prediction results show that when the variation in the BNN’s predicted values is small, the predictive distribution is narrow, indicating confidence in the predicted value. This is important information for maintenance and management, because engineers need to confirm the validity of a prediction model. A large discrepancy between predicted and actual values would undermine the reliability of the prediction model itself; however, if the predicted values vary and the actual value falls within that variation (e.g. within the 95% credible interval), the prediction can be accepted as realistic. Given these results and the nature of infrastructure management, there is a strong need to evaluate corrosion damage conservatively. For example, by using the 97.5th percentile predicted value for management, repair priorities can be set while assuming a worst-case scenario. This approach leads to safe and secure infrastructure maintenance.

In Section 3, a BNN model was constructed to predict the corrosion progression rate, including its variability. However, considering the maintenance and management of structures, the objective is not to evaluate the corrosion progression rate itself, but to evaluate how the corrosion progression rate affects the structural capacity. This study investigated whether the capacity reduction rate could be calculated using the 97.5th percentile value of the constructed BNN model in actual maintenance and management situations.

Usukura et al. (2017) organised existing experimental and analytical data and created a diagram showing the influence of corrosion on the capacity of girder ends, using parameters such as the yield capacity reduction rate and remaining plate thickness ratio that can be set from structural specifications, and analysed the capacity of damaged states. This study calculated the reduction rate relative to the yield capacity calculated from structural specifications.

where P: yield capacity calculated from structural specifications, σ_yw: nominal yield point of the web, σ_ys: nominal yield point of the bearing stiffener, A_ew: effective cross-sectional area of the web (where the effective width is calculated as 24t_w, and t_w is the web thickness), and A_es: effective cross-sectional area of the bearing stiffener.

The yield strength in both the sound state and the corroded state can be calculated using the aforementioned equations and the remaining plate thickness.
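As an illustration of how the remaining plate thickness enters the calculation, the sketch below assumes the additive form P = σ_yw A_ew + σ_ys A_es, which is consistent with the symbol definitions but is stated here as an assumption. It further assumes that A_ew = 24 t_w × t_w, that the stiffeners are rectangular plates, and that corrosion reduces the thicknesses uniformly; all dimensions and strengths are hypothetical.

```python
def yield_capacity(sigma_yw, t_w, sigma_ys, b_s, t_s, n_stiffeners=2):
    """Assumed form P = sigma_yw * A_ew + sigma_ys * A_es (N, with MPa and mm)."""
    a_ew = 24.0 * t_w * t_w          # effective web width 24*t_w times thickness
    a_es = n_stiffeners * b_s * t_s  # stiffener area (assumed rectangular plates)
    return sigma_yw * a_ew + sigma_ys * a_es

def remaining_ratio(loss_web, loss_stiff, **sound):
    """Ratio of corroded to sound yield capacity for given thickness losses (mm)."""
    corroded = dict(sound, t_w=sound["t_w"] - loss_web, t_s=sound["t_s"] - loss_stiff)
    return yield_capacity(**corroded) / yield_capacity(**sound)

# Hypothetical section: all dimensions and strengths are illustrative.
sound = dict(sigma_yw=235.0, t_w=10.0, sigma_ys=235.0, b_s=100.0, t_s=10.0)
ratio = remaining_ratio(loss_web=2.0, loss_stiff=2.0, **sound)
```

Feeding the median, 2.5th percentile, and 97.5th percentile corrosion depths into a function of this kind gives the corresponding yield capacity ratios. Whether the effective web width should follow the corroded or the sound thickness is a modelling choice that this sketch does not settle.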

Furthermore, using the ultimate strength estimation formula proposed in the study by Usukura et al., the residual yield strength ratio under corrosion damage was calculated. This was done by determining the reduction rate of ultimate strength from the reduction rate of yield strength.

The estimation formula used is described below.

  • Bilateral stiffener defect case

  • Full web defect case on the girder end side

  • Web defect case on the span side, partial web defect case on the girder end side

  • Other defect cases

where Pu: ultimate capacity with corrosion damage, PHu: ultimate capacity in the sound state, Py: yield capacity in the damaged state, and PHy: yield capacity in the sound state

The ultimate capacity evaluation in this study was limited to 65 damage cases in which the web and vertical stiffeners near the bearings were corroded, rather than covering all the data used for the BNN.

Based on the predicted distribution of corrosion progression rates obtained from the BNN model, the capacity of the sound state was calculated using information such as member dimensions and material properties, and the degree of age-related capacity reduction was estimated by considering section loss due to corrosion. Figure 15 shows the capacity remaining ratio from the sound state for the median, 2.5th percentile, and 97.5th percentile of the corrosion depth predicted by the BNN model.

Figure 15.
Figure description: Three scatter plots comparing observed values on the horizontal axis with predicted values on the vertical axis, both expressed in percent, each with a one-to-one reference line. The top plot presents the predicted median against observed values, with most points clustering near the reference line at higher observed values and greater spread at lower observed values. The middle plot shows the predicted 2.5th percentile against observed values, where most points lie above the reference line. The bottom plot shows the predicted 97.5th percentile against observed values, with many points below the reference line. Together, the plots illustrate prediction accuracy and uncertainty across the observed range.

Predicted versus observed capacity remaining ratio from BNN (Median, 2.5th percentile, 97.5th percentile)

As mentioned earlier, management on the safe side is extremely important in infrastructure maintenance and management. When the most conservative value (97.5th percentile) was compared with the actual capacity reduction rate, the predicted value exceeded the actual value (i.e. the evaluation fell on the dangerous side) for only two of the 65 data points, or about 3% (Figure 15). This result shows that, in line with the objective, evaluating the corrosion progression rate together with its variability and adopting the more severe value enables a safe-side evaluation of the capacity reduction rate, which in turn supports data-driven rationalisation of maintenance and management.

Furthermore, it was confirmed that the variability in the capacity remaining ratio was smaller compared to the variability in the corrosion progression rate. This is probably because even if corrosion progresses to some extent, if it is localised, stress is redistributed to sound members, and thus it does not immediately have a significant impact on the overall structural capacity.

This study constructed a prediction model for the corrosion progression rate that incorporates uncertainty using BNN, and by applying this model to capacity evaluation, proposed a method that allows for safe-side capacity evaluation and contributes to the rationalisation of maintenance and management based on data.

However, this is merely a simplified evaluation for specific parts, and more detailed capacity evaluation would require advanced methods such as finite element method (FEM) analysis. In the future, if it becomes possible to construct FEM models at low cost, for example, by acquiring point clouds or utilising generative AI, it is expected that precise capacity prediction will become possible through corrosion progression rate prediction. In addition, by linking inspection results to building information modelling (BIM) models, it will be possible to link inspection results with structural members, plate thickness, and materials, enabling instantaneous calculation of capacity remaining ratios through simple formulas, which is expected to further enhance the efficiency and sophistication of maintenance and management.

This research aimed to overcome the limitations of conventional deterministic deterioration prediction and ultimately achieve risk-based maintenance and management that considers uncertainty. To achieve this objective, the paper proposed a new ‘hierarchical deterioration prediction framework’ consisting of the following two stages:

First stage (wide-area screening): Efficiently narrow down a vast number of sound and damaged locations to identify those with a high probability of future deterioration progression. At this stage, random forest, which has a relatively low computational cost and is easy to interpret, is used.

Second stage (detailed risk assessment): Apply BNN to the narrowed-down critical locations. A significant feature of BNN is its ability to output prediction results not as a single value, but as a probability distribution indicating the degree of certainty. This makes it possible to probabilistically understand the future corrosion progression rate.

Existing research has frequently focused on constructing deterioration prediction models for bridges and predicting deterioration levels, but many studies have been limited to multi-class classification, so that repair priorities could only be set class by class. In contrast, this research specifically focused on the difficult-to-predict corrosion progression rate, constructed a deterioration prediction model that considers uncertainty, and newly proposed a method for predicting the posterior distribution of the corrosion progression rate.

By constructing this deterioration prediction model, which considers uncertainty, data-driven decision making for maintenance and management becomes possible before corrosion progresses to a costly state requiring patching, thereby reducing the costs associated with investigation and repair. Furthermore, by using conservative values from the predicted posterior distribution of the corrosion progression rate, unexpected corrosion damage progression can be prevented.

Figure 16 shows a proposed flow for the practical application of the corrosion damage deterioration prediction model once it has been constructed.

Figure 16.
Figure description: A flowchart of a decision process for infrastructure management. It begins with updating and linking inspection data, repair and reinforcement data, structural specifications data, traffic volume data, and alignment data. These inputs support the development of a degradation prediction model. A decision then asks whether there are locations where damage is predicted. If there are none, inspection intervals are extended. If there are, further decisions check whether corrosion-related thickness loss has begun, whether the damage is at the end of the girder, and whether the damage is predicted to progress. If progression is expected, corrosion progression rate determination and structural capacity assessment follow. The outcomes lead to monitoring and to prioritising repairs. The final stage supports optimisation of inspection and maintenance timing and of overall maintenance and management.

Construction of a corrosion damage deterioration prediction model, and a repair plan flow for setting inspection timing and repair priorities using that model

As described in Section 2, inspection data and various other data are first accumulated. A deterioration prediction model is then constructed from the accumulated data, and the risk of damage occurrence and damage progression is evaluated on the basis of its predictions. The proposed flow formulates a repair plan by setting inspection timing and repair priorities according to these risks. As more, and more accurate, inspection and other data are accumulated, the accuracy of the prediction model is expected to improve, leading to a cycle of rationalised maintenance and management.

This research focused only on corrosion damage; thus, further research is needed on other types of damage that impact safety to rationalise bridge maintenance and management. Additionally, while a simplified formula was used to link the corrosion progression rate to capacity evaluation, it only indirectly evaluated the capacity of actual structures.

In the future, to further rationalise maintenance and management, the aim is to construct similar models for fatigue cracks and concrete spalling, which can lead to serious incidents such as structural capacity reduction and third-party damage. It is also intended to develop an integrated deterioration prediction model and an optimised maintenance and management system for entire routes, not just individual bridge condition assessments.

Furthermore, it is planned to strengthen the integration of this framework with several technologies currently under research, including: artificial intelligence (AI)-based damage detection (e.g. for corrosion, cracks, and spalling) based on image and point cloud analysis (Chun and Kikuta, 2024; Lin et al., 2025a, 2025b; Yamashita et al., 2025); technology for automatically generating text descriptions (documentation) of detected damage states from images (Chun et al., 2024; Yamane et al., 2024); and the enhancement of inspection recording efficiency using BIM models (Hattori et al., 2024).

The damage identified and recorded by these AI technologies (including its type, location, and condition) will be automatically mapped onto the corresponding members in the BIM model. The probabilistic deterioration progression, such as the corrosion predicted by the BNN model from this study, will then be automatically associated with this member information.

By directly exporting this combined data to FEM analysis software to perform detailed structural analysis, it will be possible to predict the entire bridge's remaining structural capacity while accounting for uncertainty. This probabilistic capacity prediction is expected to contribute to the establishment of long-term repair plans and the execution of cost simulations for multiple repair scenarios (e.g. preventive maintenance vs. corrective maintenance) within a digital space.

Against the backdrop of the serious challenges posed by ageing social infrastructure and labour shortages, this research proposed a hierarchical deterioration prediction framework using machine learning to enhance the maintenance and management of corrosion damage in steel bridges, and demonstrated its effectiveness using actual data from the Metropolitan Expressway.

The main achievements obtained through this research are summarised in the following three points:

  • It was shown that a highly accurate model can be constructed using random forest to efficiently screen for critical locations from a vast number of inspection points.

  • As the core approach of this research, BNN was applied, successfully quantifying the corrosion progression rate as a probability distribution including its uncertainty, rather than a single predicted value.

  • By combining the probabilistic prediction results from BNN with structural engineering capacity evaluation formulas, a practical method for probabilistically evaluating future capacity reduction risk was established. This demonstrated that it is possible to prioritise repairs based on objective evidence, tailored to the safety level required by managers.
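
The idea behind the second and third points, propagating a distribution of corrosion rates through a capacity formula to obtain a risk, can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: normally distributed samples stand in for draws from the BNN predictive distribution, capacity is taken as proportional to remaining plate thickness (a crude placeholder, not the simplified formula used in this study), and the member names and numerical values are hypothetical.

```python
import numpy as np


def capacity_risk(rate_samples, t0_mm, years, capacity_limit_ratio):
    """Probability that the capacity ratio falls below a limit after `years`.

    rate_samples: corrosion-rate samples in mm/year (stand-in for BNN draws).
    Capacity ratio is assumed equal to remaining thickness / initial thickness.
    """
    rate_samples = np.asarray(rate_samples, dtype=float)
    remaining = np.clip(t0_mm - rate_samples * years, 0.0, None)
    capacity_ratio = remaining / t0_mm
    return float(np.mean(capacity_ratio < capacity_limit_ratio))


rng = np.random.default_rng(0)

# Hypothetical members: (mean, standard deviation) of corrosion rate in mm/year.
members = {"G1-end": (0.10, 0.03), "G2-end": (0.05, 0.04), "G3-end": (0.02, 0.005)}

risks = {}
for name, (mu, sigma) in members.items():
    # Negative rates are not physical, so samples are clipped at zero.
    samples = np.clip(rng.normal(mu, sigma, size=10_000), 0.0, None)
    risks[name] = capacity_risk(samples, t0_mm=9.0, years=20, capacity_limit_ratio=0.8)

# Repair priority: highest probability of falling below the limit first.
priority = sorted(risks, key=risks.get, reverse=True)
print(priority)
```

A manager requiring a higher safety level could raise `capacity_limit_ratio` or act at a lower risk threshold; the ranking step itself stays the same.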

In conclusion, the hierarchical deterioration prediction framework proposed and demonstrated in this research is an effective approach for transitioning from conventional experience-based maintenance and management to a more rational and efficient system based on data and probabilistic risk evaluation. The results of this research will contribute to extending the lifespan of infrastructure and reducing life cycle costs, and mark an important step toward realising sustainable social infrastructure.

Albrecht P and Naeemi AH (1984) Performance of weathering steel in bridges. In NCHRP Report 272. Transportation Research Board, National Research Council, Washington, DC, USA.

Ameli Z, Nesheli SJ and Landis EN (2024) Deep learning-based steel bridge corrosion segmentation and condition rating using Mask R-CNN and YOLOv8. Infrastructures 9(1): 3.

Bao Y and Li H (2021) Machine learning paradigm for structural health monitoring. Structural Health Monitoring 20(4): 1353–1372.

Chun PJ and Kikuta T (2024) Self-training with Bayesian neural networks and spatial priors for unsupervised domain adaptation in crack segmentation. Computer-Aided Civil and Infrastructure Engineering 39(17): 2642–2661.

Chun P-J, Chu H, Shitara K et al. (2024) Implementation of explanatory texts output for bridge damage in a bridge inspection web system. Advances in Engineering Software 195: 103706.

Forkan ARM, Kang YB, Jayaraman PP et al. (2022) CorrDetector: a framework for structural corrosion detection from drone images using ensemble deep learning. Expert Systems with Applications 193: 116461.

Hattori K, Oki K, Sugita A, Sugiyama T and Chun PJ (2024) Deep learning-based corrosion inspection of long-span bridges with BIM integration. Heliyon 10(15): e35308.

Horton JB (1971) The rusting of low alloy steels in the atmosphere. In Booklet 2385-A. Bethlehem Steel Corporation, Bethlehem, PA.

Japan Society of Civil Engineers (JSCE) (2020) Japan's Infrastructure Grades 2020 & Introduction of Maintenance Technologies. Japan Society of Civil Engineers, Tokyo.

Kayser JR and Nowak AS (1989) Reliability of corroded steel girder bridges. Structural Safety 6(1): 53–63.

Larrabee CP and Coburn SK (1961) The atmospheric corrosion of steels as influenced by chemical composition. In First International Congress on Metallic Corrosion. Butterworths, London, pp. 276–292.

Lee J, Sanmugarasa K, Blumenstein M and Loo YC (2008) Improving the reliability of a Bridge Management System (BMS) using an ANN-based Backward Prediction Model (BPM). Automation in Construction 17(6): 758–772.

Lin C, Abe S, Zheng S, Li X and Chun PJ (2025a) A structure-oriented loss function for automated semantic segmentation of bridge point clouds. Computer-Aided Civil and Infrastructure Engineering 40(6): 801–816.

Lin C, Chen Y, Itakura K, Maharjan S and Chun P (2025b) Bridge inspection using image-point cloud fusion with image filtering, damage detection and 3D registration. Automation in Construction 180: 106538.

Moran Yanez LM (2016) Bridge Maintenance to Enhance Corrosion Resistance and Performance of Steel Girder Bridges. PhD thesis, Purdue University, West Lafayette, IN, USA.

Nakashima M and Nagai K (2014) An investigation of road bridge maintenance system in Japan in developed society. Society for Social Management Systems Internet Journal 9(1): 1–8.

Nemoto Y (2022) Considerations on infrastructure aging and renewal investment financing. Public Policy Review 18(2).

Spencer BF, Hoskere V and Narazaki Y (2019) Advances in computer vision-based civil infrastructure inspection and monitoring. Engineering 5(2): 199–222.

Usukura M, Miyashita T, Sasaki E et al. (2017) A study on estimation method of ultimate strength of corroded steel girder ends. Journal of Japan Society of Civil Engineers, Ser. A1 (Structural Engineering & Earthquake Engineering (SE/EE)) 73(3): 560–578.

Yamane T, Chun P-J, Dang J et al. (2024) Deep learning-based bridge damage cause estimation from multiple images using visual question answering. Structure and Infrastructure Engineering 1–14.

Yamashita M, Kawanishi K, Hashizume K et al. (2025) Infrared thermography and 3D pavement surface unevenness measurement algorithm for damage assessment of concrete bridge decks. Computer-Aided Civil and Infrastructure Engineering 40(19): 2772–2792.

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms are set out in the CC BY 4.0 licence.
