Research on elevator door anomaly detection method based on multisource spatial-temporal information fusion

2026

Qingyun Li, Gongning Li, Mukai Wang, Jiefeng Li, Yongqing Wang, Mingxing Jia and Dapeng Niu

Figure 1

The conceptual workflow diagram shows a multi-source signal analysis system that combines vibration and video signals using a “TSGCN Architecture” (Temporal–Spatial Graph Convolutional Network) for classification. The diagram is organized into four main sections labeled “Multi-source Signal Acquisition”, “TSGCN Architecture”, “Feature Fusion”, and “Result”.

“Multi-source Signal Acquisition”: This section appears on the left side of the diagram and illustrates the input data sources. A smartphone icon indicates a vibration signal acquisition device. Three stacked waveform plots show vibration signals measured along three axes labeled “x-axis”, “y-axis”, and “z-axis”. Each waveform is displayed in a different color: blue for the x-axis, green for the y-axis, and red for the z-axis, representing time-varying vibration amplitudes. Below the vibration signals, a camera icon represents video signal acquisition. A step-like blue waveform labeled “Video Signal” illustrates the temporal representation of video-derived features or motion information extracted from recorded frames. Right-pointing arrows from both vibration and video signals indicate that these data streams are sent to the processing architecture.

“TSGCN Architecture”: The central section illustrates the processing architecture labeled “TSGCN Architecture”. It is divided into two main sections: “Temporal Feature Extraction” at the top and “Spatial Feature Extraction” at the bottom.

“Temporal Feature Extraction”: This section is enclosed within a dashed boundary and further divided into “Local Feature Extraction” and “Global Feature Extraction”.

“Local Feature Extraction”: On the left side, a sequence of vertical feature blocks represents feature maps processed through convolutional operations. Arrows indicate the flow through layers labeled “CONV layer 1”, “Pooling layer 1”, “CONV layer 2”, and “Pooling layer 2”. These layers extract temporal patterns from input signals by progressively reducing dimensionality while preserving important features.

“Global Feature Extraction”: On the right side, a graph-based recurrent structure processes temporal dependencies. Nodes labeled “x_1” and “x_t” represent input features at different time steps. These connect to intermediate nodes labeled “h_1” and “h_t” through weighted edges labeled “w_1”, “w_2”, “w_3”, “w_4”, “w_5”, and “w_6”. Two pathways labeled “Forward layer” and “Backward layer” indicate bidirectional processing of temporal information. Arrows between nodes show how information flows forward and backward across time steps. Output nodes labeled “y_1” and “y_t” represent the extracted temporal features after global processing.

“Spatial Feature Extraction”: This section appears below and focuses on extracting relationships between features using graph-based methods. Multiple stacked blocks labeled “Graph Convolution” illustrate repeated graph convolution operations. Inside each block, a network of interconnected nodes represents a graph structure where nodes exchange information. Each graph convolution block is followed by layers labeled “Batch Norm” and “ReLU”, indicating normalization and nonlinear activation. On the right side of the spatial section, a module labeled “Graph Readout” aggregates node-level features into a single global representation. A graph with connected nodes is shown, followed by a red circular output node. The aggregation method is labeled “global average pooling”, indicating that features from all nodes are averaged to produce the final output.

“Feature Fusion”: The lower-left section shows how features from different modalities are combined. It is divided into two main parts: feature fusion at the top and a neural network classifier at the bottom. On the upper left side, a dashed box labeled “Fusion of Temporal Features” displays two horizontal rows of circular nodes representing temporal feature vectors. The top row is labeled “Vibrate Features” and contains a sequence of circular nodes representing temporal features extracted from vibration signals. The bottom row is labeled “Video Features” and contains a similar sequence of circular nodes representing temporal features extracted from video data. A plus symbol between the two rows indicates that vibration and video temporal features are combined to produce a fused temporal representation. On the upper right side, another dashed box labeled “Fusion of Spatial Features” shows a similar structure. The upper row represents spatial features derived from vibration signals, while the lower row represents spatial features derived from video signals. Each row contains circular nodes representing feature elements. A plus symbol between the rows indicates that the spatial features from both modalities are fused together.

Below the two fusion blocks, arrows from both the temporal and spatial fusion outputs converge into a label “Concat”, indicating that the fused temporal and spatial features are concatenated into a single combined feature vector. The concatenated feature vector is passed into a neural network classifier illustrated in the lower section. A vertical column of nodes represents the input feature vector. An arrow labeled “ReLU” indicates the application of the Rectified Linear Unit activation function. The features then pass through a fully connected neural network layer illustrated by multiple nodes connected with lines, representing learned weights between layers. On the right side, a vertical column of nodes labeled “Output Classes” represents the final classification results. Each node corresponds to a predicted class category generated by the model.

“Result”: The rightmost section presents the final evaluation results, consisting of two visualizations: a scatter plot of classification outputs and a confusion matrix summarizing model performance. At the top, a scatter plot displays clustered data points representing different classes predicted by the model. The plot includes a legend labeled “Class” with four categories: “Jamming Fault”, “Door Control Fault”, “Slowdown Fault”, and “Normal”. Each class is represented by a distinct color. The data points form four clearly separated clusters in different regions of the plot, indicating strong class separation. One cluster appears on the left side around negative horizontal values, another cluster appears near the upper center, a third cluster is slightly lower but still near the center-right, and a fourth cluster appears on the lower right side. The separation between clusters suggests that the model effectively distinguishes between different fault conditions and normal operation.

Below the scatter plot, a matrix labeled “Confusion Matrix” presents classification performance in a grid format. The vertical axis is labeled “True”, and the horizontal axis is labeled “Predicted”, with class indices ranging from 0 to 3. The matrix contains four rows and four columns with numerical values indicating prediction counts: Row 0 (True class 0): 38 correct predictions, with 1 misclassified as class 1 and 1 as class 3. Row 1 (True class 1): 38 correct predictions, with 2 misclassified as class 0. Row 2 (True class 2): 40 correct predictions, with no misclassifications. Row 3 (True class 3): 39 correct predictions, with 2 misclassified as class 1. The diagonal values are high compared to off-diagonal values, indicating strong overall classification accuracy. Misclassifications are minimal and occur only between a few class pairs. Note: All numerical data values are approximated.

Convolutional network fusion model for spatial-temporal maps. Source(s): Figure created by authors
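
The fusion step described for Figure 1 can be sketched in a few lines. This is a minimal illustration only: it assumes the plus symbol denotes element-wise addition of equal-length feature vectors and that “Concat” joins the fused temporal and spatial vectors end to end. The function name `fuse_features` and the example values are hypothetical and are not taken from the paper.

```python
def fuse_features(vib_temporal, vid_temporal, vib_spatial, vid_spatial):
    """Fuse vibration and video features, then concatenate both fusions.

    Assumption: "+" in the figure means element-wise addition, so the two
    modalities must provide vectors of the same length in each branch.
    """
    if len(vib_temporal) != len(vid_temporal):
        raise ValueError("temporal feature vectors must have equal length")
    if len(vib_spatial) != len(vid_spatial):
        raise ValueError("spatial feature vectors must have equal length")

    # Element-wise fusion inside each branch (temporal and spatial).
    fused_temporal = [a + b for a, b in zip(vib_temporal, vid_temporal)]
    fused_spatial = [a + b for a, b in zip(vib_spatial, vid_spatial)]

    # "Concat": one combined feature vector for the classifier.
    return fused_temporal + fused_spatial


# Hypothetical toy vectors; real feature vectors would be much longer.
combined = fuse_features([1.0, 2.0], [0.5, 0.5], [3.0], [1.0])
```

Addition keeps the fused vector the same length as each input, whereas the final concatenation preserves the temporal and spatial information as separate coordinates for the classifier.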

Figure 2

A diagram shows graph construction from data using pairwise relationships between nodes.

The conceptual workflow shows how raw data is transformed into a graph structure based on relationships between features or nodes. On the left side, a box labeled “Data” contains a simple line plot representing an input signal or time-series data. Below it, an arrow points downward to a small network diagram composed of circular nodes connected by lines, indicating an initial graph representation derived from the data. In the center, a large rounded box illustrates how relationships between nodes are computed. Inside this box, two elements labeled “A_i” and “A_j” represent two nodes or features. A function is defined below them as “A_{i,j} = f(A_i, A_j)”. To the right, two possible outcomes are shown: when “A_{i,j} = 0”, the nodes “A_i” and “A_j” are displayed without a connecting line, indicating no edge between them; when “A_{i,j} = 1”, the nodes are connected by a line, indicating the presence of an edge. On the far right, an arrow points to a more complex network graph with multiple nodes and connections.

Construction of time series graph structure. Source(s): Figure created by authors
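
The pairwise rule in Figure 2 can be sketched as follows. The figure does not say what the function f is, so this example assumes, purely for illustration, that two sample points are connected when their values differ by no more than a threshold. The names `build_adjacency` and `threshold` are hypothetical.

```python
def build_adjacency(samples, threshold):
    """Return a binary adjacency matrix for a list of scalar sample points.

    Assumption: f(A_i, A_j) = 1 when |A_i - A_j| <= threshold, else 0.
    The source figure does not define f; this choice is illustrative only.
    """
    n = len(samples)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # No self-loops here; an edge means the pair is "related".
            if i != j and abs(samples[i] - samples[j]) <= threshold:
                adjacency[i][j] = 1
    return adjacency


# Hypothetical toy signal with two groups of similar values.
signal = [0.0, 0.1, 0.9, 1.0]
adj = build_adjacency(signal, threshold=0.2)
```

With this toy signal, the first two samples form one connected pair and the last two form another, which mirrors the "edge" and "no edge" cases drawn in the figure.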

Figure 3

A diagram of “Local” and “Global Feature Extraction” with attention-based key temporal features.

The detailed pipeline for feature extraction from time-series data is divided into three labeled sections: “Local Feature Extraction”, “Global Feature Extraction”, and “Key Features”.

On the left side, the section “Local Feature Extraction” shows an input signal as a vertical waveform plot. Segmented portions of the signal are highlighted and fed into two parallel processing streams with an ellipsis between them. Each stream begins with a block labeled “CONV”, representing convolutional layers applied to extract local patterns. The output passes through blocks labeled “BN” (batch normalization) and “ReLU” activation. Circular nodes labeled “Max” indicate “Max pooling” operations that reduce dimensionality while preserving important features. This sequence (“CONV”, “BN”, “ReLU”, and “Max pooling”) is repeated twice in each stream, producing stacked feature maps. The outputs from multiple streams are then combined into vertical feature vectors, each shown by a rectangle containing stacked circular nodes, representing extracted local temporal features.

In the center, a section labeled “Global Feature Extraction” models temporal dependencies using a bidirectional structure. Input nodes labeled “x_1” and “x_t” represent features at different time steps. These connect to hidden nodes labeled with the vectors “h_1” and “h_t” through weighted connections labeled “w_1”, “w_2”, “w_3”, “w_4”, “w_5”, and “w_6”. Two pathways are shown: a “Forward layer” and a “Backward layer”, indicating bidirectional processing of temporal information. Arrows illustrate the flow of information across time steps in both directions. Output nodes labeled “y_1” and “y_t”, with an ellipsis between them, represent globally extracted temporal features.

On the right side, a section labeled “Key Features” applies a block labeled “Multi-head Attention”. This module takes the globally extracted features and computes attention weights to emphasize the most important temporal information. The output is a set of three stacked circular nodes, with an ellipsis, labeled “Temporal Features”, representing refined feature vectors after attention-based selection.

Time dimension model. Source(s): Figure created by authors
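
One stage of the local feature extraction in Figure 3 can be sketched as a 1-D convolution followed by ReLU and max pooling. This is a simplified illustration, not the authors' implementation: batch normalization is omitted, the kernel is a fixed hypothetical difference filter rather than a learned one, and the pooling window of 2 is an assumption.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as in most DL libraries)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified Linear Unit: negative activations are clipped to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def local_block(signal, kernel):
    """One CONV -> ReLU -> Max pooling stage (batch normalization omitted)."""
    return max_pool(relu(conv1d(signal, kernel)))


# Hypothetical toy input; the kernel responds to falling edges in the signal.
toy_signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0]
local_features = local_block(toy_signal, kernel=[1.0, -1.0])
```

In the figure this stage is applied twice in sequence, so a second call to `local_block` on a sufficiently long output would correspond to the second CONV and pooling pair.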

Figure 4

A diagram showing graph convolution layers extracting spatial features from time-series data.

The diagram presents a pipeline for extracting spatial features from time-series data using graph convolutional networks. The diagram is divided into several labeled sections: “Data”, “GCN 1”, “GCN 2”, “GCN 3”, and the final output labeled “Spatial Features”, each connected by a right-pointing arrow.

On the left side, a dashed box labeled “Data” contains a waveform plot representing a time-series signal. Below the waveform, a legend shows colored circular nodes labeled “Sample point 1”, “Sample point 2”, “Sample point 3”, followed by an ellipsis, and “Sample point n”. These colored nodes represent individual data samples that will be treated as nodes in a graph. An arrow points from the data section toward the first graph convolution block.

The first processing block is labeled “GCN 1”. At the top of the block, a diagram labeled “Graph Convolution” shows a small network of connected nodes representing the graph structure. Below it, two sequential layers are labeled “Batch Norm” and “ReLU”, indicating batch normalization followed by a rectified linear unit activation. The second block labeled “GCN 2” repeats the same structure. A graph convolution layer processes node relationships, followed by a “Batch Norm” layer and a “ReLU” activation layer. The third block labeled “GCN 3” again contains a “Graph Convolution” diagram followed by “Batch Norm” and “ReLU”. Arrows between the blocks show the flow of information from one layer to the next.

After the third graph convolution block, the output is passed to a node labeled “GAP”, which stands for “Global Average Pooling”. This operation aggregates node-level information into a single feature representation. The final output appears as a vertical column of circular nodes labeled “Spatial Features”, representing the extracted spatial feature vector derived from the graph-based processing of the data.

Spatial dimension model. Source(s): Figure created by authors
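
A single graph convolution followed by global average pooling, as outlined in Figure 4, might look like the sketch below. It assumes the common symmetric-normalized propagation rule with self-loops, which the figure itself does not specify, and it leaves out batch normalization. The function names, the toy graph, and the weight values are all hypothetical.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution with ReLU: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumption: symmetric normalization with self-loops; batch
    normalization from the figure is omitted to keep the sketch short.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)] for i in range(n)]

    f_in, f_out = len(features[0]), len(weights[0])
    # Aggregate neighbor features, then apply the linear map and ReLU.
    aggregated = [
        [sum(norm[i][k] * features[k][c] for k in range(n)) for c in range(f_in)]
        for i in range(n)
    ]
    return [
        [max(0.0, sum(aggregated[i][c] * weights[c][o] for c in range(f_in))) for o in range(f_out)]
        for i in range(n)
    ]


def global_average_pool(node_features):
    """Average every feature channel over all nodes (the GAP readout)."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]


# Hypothetical two-node graph with one input channel and two output channels.
toy_adjacency = [[0, 1], [1, 0]]
toy_features = [[1.0], [3.0]]
toy_weights = [[1.0, -1.0]]
spatial = global_average_pool(gcn_layer(toy_adjacency, toy_features, toy_weights))
```

Stacking three such layers, each followed by normalization and ReLU, and pooling only once at the end would follow the layout drawn in the figure.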

Figure 5

A pair of photographs shows video and vibration data acquisition setups for an elevator door system.

The left panel labeled “(a)” shows the video data acquisition setup inside an elevator with metallic interior walls. A smartphone mounted on a small holder is attached to the wall and is labeled “Video Collector”, with its screen displaying a recording interface. Above the smartphone near the ceiling, a dome-shaped surveillance camera is visible. The surrounding surfaces appear metallic and reflective, forming the interior structure of the elevator cabin. The right panel labeled “(b)” shows the vibration data acquisition setup near the elevator doorway. The panel displays two metallic sliding door sections identified by the labels “Landing Door” on the left and “Car Door” on the right. Between the door panels, a small rectangular device labeled “Vibration sensor” is mounted vertically along the door frame. Thin annotation lines connect each label to the corresponding component, indicating the sensor placement and the positions of the two doors.

Device acquisition diagram. Source(s): Figure created by authors

Figure 6

A set of twelve line graphs shows acceleration over time across three axes for different operating conditions.

The twelve panels are arranged in three rows and four columns, grouped into four conditions labeled “(a)”, “(b)”, “(c)”, and “(d)” at the bottom, each containing three line graphs for “Axis X”, “Axis Y”, and “Axis Z”. In all panels, the horizontal axis is labeled “Time step” and ranges approximately from 0 to 600 in (a), (b), and (d) in increments of 100 units and from 0 to 1000 in (c) in increments of 200 units. The vertical axis is labeled “Acceleration (meters per second squared)”.

In condition “(a)”, the legend on each plot is labeled “Normal-Axis X”, “Normal-Axis Y”, and “Normal-Axis Z”. The vertical axis ranges from negative 7.5 to 5.0 in increments of 2.5 in the top plot, from negative 4 to 2 in increments of 2 in the middle plot, and from negative 5.0 to 7.5 in increments of 2.5 in the bottom plot. The signals across all three axes show moderate fluctuations around zero with occasional spikes, including a noticeable peak around time step 200 and another cluster of activity near 500 to 600, with brief sharp dips and rises indicating transient motion.

In condition “(b)”, the legend on each plot is labeled “Slowdown-Axis X”, “Slowdown-Axis Y”, and “Slowdown-Axis Z”. The vertical axis ranges from negative 10.0 to 5.0 in increments of 2.5 in the top plot, from negative 2 to 6 in increments of 2 in the middle plot, and from negative 10 to 5 in increments of 5 in the bottom plot. The signals display stronger variability compared to normal, with more frequent spikes and wider amplitude changes, including pronounced peaks around time steps 100 to 200 and again near 500 to 600, along with intermittent quieter intervals.

In condition “(c)”, the legend on each plot is labeled “Jamming-Axis X”, “Jamming-Axis Y”, and “Jamming-Axis Z”. The vertical axis ranges from negative 2 to 4 in increments of 2 in the top plot, from negative 10 to 5 in increments of 5 in the middle plot, and from negative 4 to 6 in increments of 2 in the bottom plot. The signals show irregular and abrupt bursts with larger amplitude deviations, including sharp spikes and sudden drops, particularly strong negative excursions in the middle plot and clustered oscillations around time steps near 600 to 800, indicating unstable behavior.

In condition “(d)”, the legend on each plot is labeled “Abnormal Door Closing-Axis X”, “Abnormal Door Closing-Axis Y”, and “Abnormal Door Closing-Axis Z”. The vertical axis ranges from negative 5 to 7.5 in increments of 2.5 in the top plot, from negative 2 to 6 in increments of 2 in the middle plot, and from negative 5 to 7.5 in increments of 2.5 in the bottom plot. The signals exhibit strong early fluctuations with distinct peaks around time steps 100 to 200, followed by relatively stable segments and later renewed activity near 500, with noticeable spikes and uneven oscillations across all three axes.

Note: All numerical data values are approximated.

Three-dimensional vibration signals under different states. (a) Normal. (b) Slowdown. (c) Jamming. (d) Abnormal door closing. Source(s): Figure created by authors

Figure 7

A photograph shows an elevator doorway with vertical reference lines and a graph of door position over time.

The left panel labeled “(a)” shows an elevator entrance framed by metallic sliding doors that are partially open, exposing a light-colored interior wall and a closed gray door with a handle in the background. The door panels have visible vertical seams and reflective surfaces. Two thin vertical reference lines are overlaid near the left and right edges of the doorway, aligned with the door boundaries. On the left door panel, small circular safety icons are arranged vertically, and on the right side, a notice board and additional signage are visible on the wall next to the door frame. The elevator frame and surrounding panels appear smooth and metallic, with straight edges and a rectangular opening. The right panel labeled “(b)” contains a line graph with a legend labeled “Door position”, representing a line. The horizontal axis is labeled “time” and ranges from negative 2 to 16 in increments of 2 units. The vertical axis is labeled “Pixel value” and ranges from 300 to 600 in increments of 50 units. The plotted line begins near 320 around time 0, increases sharply around time 2 to reach 590 near time 4, remains nearly constant close to 590 until about time 10, then decreases rapidly after time 11 and returns to 320 by around time 13, remaining stable afterward. Note: All numerical data values are approximated.

Video processing. (a) Edge position of the door. (b) Door displacement curve. Source(s): Figure created by authors
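
A displacement curve like the one in Figure 7(b) can be converted into a velocity curve by differencing consecutive positions. The text here does not state how the velocity signals were actually derived, so the sketch below is only one plausible approach: a first-order finite difference with a hypothetical frame interval `dt` and an optional hypothetical unit-conversion factor `scale`.

```python
def finite_difference_velocity(positions, dt, scale=1.0):
    """Estimate velocity between consecutive frames.

    positions: door edge position per frame (pixels in Figure 7(b)).
    dt: time between frames in seconds (hypothetical value below).
    scale: optional pixel-to-length conversion factor (hypothetical).
    """
    return [
        (positions[k + 1] - positions[k]) * scale / dt
        for k in range(len(positions) - 1)
    ]


# Hypothetical samples loosely following the opening phase (about 320 to 590 pixels).
door_positions = [320, 320, 455, 590, 590]
door_velocity = finite_difference_velocity(door_positions, dt=0.5)
```

Under this convention an opening door yields positive velocity and a closing door yields negative velocity. Real video-derived positions are noisy, so some smoothing before differencing would usually be needed.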

Figure 8

A set of four line graphs shows elevator door velocity over time for different operating conditions.

The four panels in a two-by-two grid are labeled “(a)”, “(b)”, “(c)”, and “(d)”, each showing a line graph with a legend labeled “Normal”, “Slowdown”, “Jamming”, and “Abnormal Door Closing”, respectively. In all panels, the horizontal axis is labeled “Time step” and ranges from 0 to 400 in (a) and (b) in increments of 100 units, from 0 to 400 in (c) in increments of 200 units, and from 0 to 300 in (d) in increments of 100 units. The vertical axis is labeled “Velocity (meters per second)” and ranges from negative 2 to 2 in increments of 1 unit in (a), (b), and (d), and from negative 1 to 1 in (c). In panel “(a)”, the velocity increases from near 0 to above 2 in early time steps, then drops to 0 and remains stable before decreasing sharply to around negative 2 near time step 300 and finally returning toward 0. In panel “(b)”, the velocity follows a similar pattern with an initial rise above 2, a flat region at 0, then a drop to around negative 2 after time step 300, followed by a gradual return toward 0. In panel “(c)”, the velocity rises to around 1, stabilizes briefly, then drops to near 0, followed by a gradual decline to around negative 1 near time step 350, and then fluctuates slightly while remaining below 0. In panel “(d)”, the velocity increases to above 2 in the early phase, quickly drops to 0, remains flat for a period, then decreases sharply to near negative 2 around time step 250, and ends with slight fluctuations below 0. Note: All numerical data values are approximated.

Elevator door operating curves under different states. (a) Normal. (b) Slowdown. (c) Jamming. (d) Abnormal door closing. Source(s): Figure created by authors

Figure 9

A heatmap shows a confusion matrix comparing true and predicted class labels.

The heatmap is titled “Confusion Matrix”, showing a four-by-four grid of values. The horizontal axis is labeled “Predicted” and includes class labels 0, 1, 2, and 3. The vertical axis is labeled “True” and includes class labels 0, 1, 2, and 3. Each cell contains a numeric value representing the count of predictions for each true class. In the first row for true class 0, the values are 38 under predicted 0, 1 under predicted 1, 0 under predicted 2, and 1 under predicted 3. In the second row for true class 1, the values are 2 under predicted 0, 38 under predicted 1, 0 under predicted 2, and 0 under predicted 3. In the third row for true class 2, the values are 0 under predicted 0, 0 under predicted 1, 40 under predicted 2, and 0 under predicted 3. In the fourth row for true class 3, the values are 0 under predicted 0, 2 under predicted 1, 0 under predicted 2, and 39 under predicted 3. The diagonal cells contain the highest values, indicating correct classifications, while the off-diagonal cells contain small values representing misclassifications.

Confusion matrix result. Source(s): Figure created by authors
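The summary metrics can be recomputed from the counts described for this confusion matrix. The sketch below assumes macro averaging (an unweighted mean over the four classes); the section does not state which averaging the authors used, and the helper name `macro_metrics` is illustrative.

```python
# Counts as described for the confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [38, 1, 0, 1],
    [2, 38, 0, 0],
    [0, 0, 40, 0],
    [0, 2, 0, 39],
]


def macro_metrics(cm):
    """Return (accuracy, macro precision, macro recall, macro F1) for a square matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


accuracy, precision, recall, f1 = macro_metrics(confusion)
print(f"{accuracy:.2%} {precision:.2%} {recall:.2%} {f1:.2%}")
```

Under the macro-averaging assumption these counts give 96.27% accuracy, 96.30% precision, 96.28% recall and 96.28% F1, which agrees with the TSGCN row of Table 5.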

Figure 10

A scatter plot shows clusters of four classes in a two-dimensional feature space.

The plot displays a scatter distribution of data points grouped into four classes with a legend titled “Class” identifying “Jamming”, “Abnormal Door Closing”, “Slowdown”, and “Normal”. The horizontal axis ranges from negative 10 to 15 in increments of 5 units, and the vertical axis ranges from negative 5 to 15 in increments of 5 units. The points form distinct clusters in different regions of the plot. The “Abnormal Door Closing” cluster is located on the left side around horizontal values near negative 12 and vertical values around 4 to 5, forming a compact group. The “Jamming” cluster appears on the lower right side around horizontal values near 12 to 14 and vertical values around negative 5 to negative 3. The “Slowdown” cluster is positioned in the upper middle-right region around horizontal values near 4 to 5 and vertical values around 8 to 10. The “Normal” cluster is located slightly above and to the right of the slowdown cluster, around horizontal values near 5 to 7 and vertical values around 11 to 14. The clusters are well separated with minimal overlap, indicating a clear distinction among the four classes. Note: All the numerical data values are approximated.

UMAP visualization result. Source(s): Figure created by authors

Figure 11

A set of four line graphs shows sensitivity analysis of model parameters and accuracy.

The four panels arranged in a two-by-two grid are labeled “(a)”, “(b)”, “(c)”, and “(d)”, each showing a line graph illustrating the sensitivity of different parameters on model accuracy. In all panels, the vertical axis is labeled “Accuracy” and ranges from approximately 0.88 to 0.98 in increments of 0.02 in panels “(a)” and “(b)”, from 0.80 to 1.00, with the intermediate markings at 0.83, 0.85, 0.88, 0.90, 0.93, 0.95, and 0.98 in panel (c), and from 0.88 to 1.00 in increments of 0.02 in panel “(d)”. In panel “(a)” titled “Sensitivity of K in K N N”, the horizontal axis is labeled “K in K N N” and ranges from 1 to 10 in increments of 1 unit. The plotted points fluctuate around 0.90 to 0.96, increasing from about 0.90 at K equals 1 to around 0.945 at K equals 2, decreasing slightly at K equals 3, rising again and reaching the highest value near 0.96 at K equals 5, then dropping to about 0.90 at K equals 7 before gradually increasing toward approximately 0.94 at K equals 10. In panel “(b)” titled “Sensitivity of Distance Metric in K N N”, the horizontal axis is labeled “Distance Metric” and includes three categorical values: “Euclidean”, “Manhattan”, and “Chebyshev”. The plotted values show the highest accuracy near 0.964 for Euclidean, slightly lower near 0.943 for Manhattan, and the lowest around 0.91 for Chebyshev, indicating a decreasing trend. In panel “(c)” titled “Sensitivity of Learning Rate”, the horizontal axis is labeled “Learning Rate” and includes values 10 to the negative 5 power, 5 times 10 to the negative 5 power, 10 to the negative 4 power, 5 times 10 to the negative 4 power, and 10 to the negative 3 power. The plotted accuracy rises from approximately 0.85 at 10 to the negative 5 power to a peak around 0.96 at 10 to the negative 4 power, then decreases to about 0.91 at 5 times 10 to the negative 4 power and further to roughly 0.85 at 10 to the negative 3 power. 
In panel “(d)” titled “Sensitivity of Dropout”, the horizontal axis is labeled “Dropout Rate” and ranges from 0.1 to 0.5 in increments of 0.1. The plotted values increase from approximately 0.913 at 0.1 to about 0.96 at 0.3, then decrease to around 0.93 at 0.4 before slightly increasing again to near 0.94 at 0.5. Note: All numerical data values are approximated.

Accuracy comparison of different hyperparameters. (a) Sensitivity of K in KNN. (b) Effect of distance metric on KNN. (c) Sensitivity of learning rate. (d) Sensitivity of dropout. Source(s): Figure created by authors
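Panels (a) and (b) vary K and the distance metric of the KNN step. This section does not show how KNN is used inside the model, so the sketch below only illustrates why the metric matters: the three metrics named in panel (b) can rank the same neighbours differently. The function `k_nearest` and the sample points are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}


def k_nearest(points, query_index, k, metric="euclidean"):
    """Indices of the k points closest to points[query_index], nearest first."""
    dist = METRICS[metric]
    query = points[query_index]
    ranked = sorted(
        (dist(query, p), j) for j, p in enumerate(points) if j != query_index
    )
    return [j for _, j in ranked[:k]]


# Illustrative 2-D feature vectors; index 0 is the query point.
points = [(0.0, 0.0), (3.0, 0.0), (2.0, 2.0), (0.0, 2.5)]
neighbours = {name: k_nearest(points, 0, k=2, metric=name) for name in METRICS}
```

For these points the two nearest neighbours of the query are indices 3 and 2 under the Euclidean metric, 3 and 1 under Manhattan, and 2 and 3 under Chebyshev.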

Figure 12

Two bar charts compare performance scores of six models on video and vibration datasets.

The two side-by-side grouped bar charts are labeled “(a)” and “(b)”. In both charts, the horizontal axis lists the metrics “Accuracy”, “Precision”, “Recall”, and “F 1”, and the vertical axis is labeled “Score” ranging from 0.65 to 1.00 in increments of 0.05. Each metric group contains six bars corresponding to the models “Li Conv Former”, “T C N”, “C N N”, “Bi L S T M”, “G C N”, and “T S G C N”. In panel “(a)” titled “Video Dataset”, the bars show approximate values where Li Conv Former achieves about 0.75 accuracy, 0.77 precision, 0.75 recall, and 0.755 F 1; T C N shows around 0.80 accuracy, 0.81 precision, 0.80 recall, and 0.798 F 1; C N N records about 0.75 accuracy, 0.763 precision, 0.75 recall, and 0.75 F 1; Bi L S T M shows about 0.81 accuracy, 0.817 precision, 0.808 recall, and 0.81 F 1; G C N reaches approximately 0.85 accuracy, 0.86 precision, 0.85 recall, and 0.85 F 1; and T S G C N shows the highest values around 0.875 accuracy, 0.88 precision, 0.875 recall, and 0.875 F 1. In panel “(b)” titled “Vibration Dataset”, the bars indicate Li Conv Former with about 0.775 accuracy, 0.77 precision, 0.775 recall, and 0.77 F 1; T C N with approximately 0.825 accuracy, 0.853 precision, 0.825 recall, and 0.817 F 1; C N N with about 0.82 accuracy, 0.818 precision, 0.82 recall, and 0.818 F 1; Bi L S T M with around 0.925 accuracy, 0.927 precision, 0.925 recall, and 0.925 F 1; G C N with roughly 0.70 accuracy, 0.70 precision, 0.70 recall, and 0.70 F 1; and T S G C N with the highest scores near 0.95 across accuracy, precision, recall, and F 1. Note: All numerical data values are approximated.

Performance comparison of anomaly detection models under single-sensor conditions. (a) Video data performance comparison. (b) Vibration data performance comparison. Source(s): Figure created by authors

Table 1

Description of elevator door conditions

Condition	Label	Number of training/validation/testing samples
Normal	0	120/40/40
Slowdown	1	120/40/40
Jamming	2	120/41/40
Abnormal door closing	3	120/40/41

Source(s): Table created by authors

Table 2

Structure of the network module

Module name	Function	Network architecture
BiLSTM	Local feature extraction	Conv1d(k = 3, s = 1, p = 1), BatchNorm1d(64), ReLU()
		MaxPool1d()
		Conv1d(k = 3, s = 1, p = 1), BatchNorm1d(128), ReLU()
		MaxPool1d()
	Contextual relationship	BiLSTM(hidden = 128(Video)/256(Vibration))
	Focus on important features	Multi-head Attention(num_heads = 4)
GCN	Spatial feature	GCNConv(hidden = 128(Video)/256(Vibration)), ReLU()
		GCNConv(), ReLU()
		GCNConv(), ReLU(), Global Average Pooling()
Characteristic fusion	Multisource spatial-temporal characterization	Concat()
		Linear()
		Dropout(0.3)
		Linear()

Source(s): Table created by authors
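The local feature extraction rows of Table 2 can be traced as a shape calculation. The sketch below uses the standard output-length formula for 1-D convolution and pooling; the channel counts 64 and 128 follow from the BatchNorm1d arguments in the table, while the pooling kernel size of 2 and the input length of 400 steps are assumptions, since Table 2 lists MaxPool1d() without arguments.

```python
def out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1-D convolution or pooling layer (no dilation)."""
    return (length + 2 * padding - kernel) // stride + 1


def local_feature_shapes(length, pool_kernel=2):
    """Trace (layer, channels, length) through the two Conv1d + MaxPool1d stages."""
    shapes = []
    for channels in (64, 128):
        # Conv1d(k = 3, s = 1, p = 1) keeps the sequence length unchanged.
        length = out_len(length, kernel=3, stride=1, padding=1)
        shapes.append(("conv", channels, length))
        # MaxPool1d with an assumed kernel and stride of pool_kernel.
        length = out_len(length, kernel=pool_kernel, stride=pool_kernel)
        shapes.append(("pool", channels, length))
    return shapes


shapes = local_feature_shapes(400)
for layer, channels, length in shapes:
    print(layer, channels, length)
```

Under these assumptions a 400-step input leaves the second pooling layer as 128 channels of length 100, which is the sequence the BiLSTM row would then consume.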

Table 3

Training configuration parameters in the network

Parameter	Set value
Optimizer	AdamW
Initial learning rate	1e−4
Weight decay	5e−2
Scheduling strategy	ReduceLROnPlateau
Loss function	CrossEntropy
Epoch	200
Batch size	16

Source(s): Table created by authors
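The scheduling strategy in Table 3 lowers the learning rate when a monitored quantity stops improving. The class below is a simplified, framework-free stand-in for that rule, not the library implementation: `PlateauLR`, its `factor` and `patience` values, and the toy loss sequence are assumptions, since Table 3 gives only the initial rate, batch size and epoch count.

```python
import math


class PlateauLR:
    """Cut the learning rate when validation loss has not improved for more than `patience` epochs."""

    def __init__(self, lr, factor=0.5, patience=5, min_lr=0.0):
        self.lr = lr
        self.factor = factor      # hypothetical value, not given in Table 3
        self.patience = patience  # hypothetical value, not given in Table 3
        self.min_lr = min_lr
        self.best = math.inf
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
        return self.lr


# Iteration budget implied by Table 1 (4 classes x 120 training samples) and Table 3.
train_samples = 4 * 120
steps_per_epoch = math.ceil(train_samples / 16)  # batch size 16
total_steps = steps_per_epoch * 200              # 200 epochs

scheduler = PlateauLR(lr=1e-4, factor=0.5, patience=2)
history = [scheduler.step(loss) for loss in [1.0, 0.8, 0.8, 0.8, 0.8, 0.7]]
```

With 480 training samples and a batch size of 16, each epoch has 30 optimizer steps, or 6,000 steps over 200 epochs. In the toy run the rate is halved once, after the third consecutive epoch without improvement.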

Table 4

Performance comparison of anomaly detection models under single-sensor conditions

Methods	Results of the comparison of the two datasets
	Video data				Vibration data
	Accuracy	Precision	Recall	F1	Accuracy	Precision	Recall	F1
LiConvFormer	75.16%	76.95%	75.17%	75.50%	77.64%	77.01%	77.56%	76.82%
TCN	80.12%	81.09%	80.09%	79.80%	82.61%	85.29%	82.53%	81.68%
CNN	75.16%	76.28%	75.06%	75.16%	81.99%	81.83%	81.91%	81.76%
BiLSTM	80.75%	81.65%	80.70%	81.02%	92.55%	92.69%	92.50%	92.44%
GCN	85.09%	85.81%	85.06%	85.09%	70.19%	70.12%	70.17%	70.08%
TSGCN	87.58%	87.97%	87.55%	87.53%	95.03%	94.99%	95.00%	94.97%

Source(s): Table created by authors

Table 5

Comparison of the performance of fusion models and non-spatiotemporal methods in anomaly detection

Methods	Accuracy	Precision	Recall	F1
MLP	64.60%	64.65%	64.70%	64.08%
xLSTM	77.64%	78.88%	77.56%	77.80%
mixCNN	74.53%	74.07%	74.44%	73.73%
ResCISTA-Net	65.84%	65.43%	65.81%	65.50%
TSGCN	96.27%	96.30%	96.28%	96.28%

Source(s): Table created by authors

Table 6

Performance comparison of state-of-the-art spatio-temporal models for anomaly detection

Methods	Accuracy	Precision	Recall	F1
MTGNN	77.02%	78.03%	77.02%	76.13%
ASTGNN	78.88%	79.30%	78.88%	78.97%
STFGNN	77.02%	80.22%	77.02%	75.66%
STSGCN	66.46%	66.39%	66.46%	66.13%
TSGCN	96.27%	96.30%	96.28%	96.28%

Source(s): Table created by authors
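The accuracies in Table 6 can be read as counts of correctly classified test samples. The sketch below assumes every model was evaluated on the same test split listed in Table 1 (40 + 40 + 40 + 41 samples); that shared split is an inference from the numbers, not something this section states.

```python
TEST_SAMPLES = 40 + 40 + 40 + 41  # testing column of Table 1

table6_accuracy = {
    "MTGNN": 77.02,
    "ASTGNN": 78.88,
    "STFGNN": 77.02,
    "STSGCN": 66.46,
    "TSGCN": 96.27,
}

# Number of correctly classified samples implied by each accuracy.
correct = {name: round(acc / 100 * TEST_SAMPLES) for name, acc in table6_accuracy.items()}

# Margin of TSGCN over the strongest baseline, in percentage points.
best_baseline = max(
    (name for name in table6_accuracy if name != "TSGCN"), key=table6_accuracy.get
)
margin_points = table6_accuracy["TSGCN"] - table6_accuracy[best_baseline]

for name, count in correct.items():
    print(f"{name}: {count}/{TEST_SAMPLES}")
```

Each reported accuracy corresponds to a whole number of samples out of 161 (for example 155 for TSGCN and 127 for ASTGNN), and the margin over the strongest baseline, ASTGNN, is about 17.4 percentage points.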
