Accurate and timely fault detection is crucial for ensuring the safety and reliability of jet engines. Traditional models struggle to capture the non-linear fault patterns in the high-dimensional sensor data produced by modern engines.
This paper presents a comprehensive comparative study of six machine learning models: XGBoost, random forest (RF), logistic regression (LR), k-nearest neighbours (KNN), support vector machine (SVM) and multi-layer perceptron (MLP). The models are used to identify and classify jet engine faults based on sensor readings. To enable this analysis, a realistic data set is used that covers normal operating conditions and two distinct fault classes: a compressor high-pressure turbine fault and an "other faults" category. The data set includes sensor noise, fault severity variations and physically meaningful feature interactions.
The results demonstrate that gradient-boosted trees (XGBoost), RF and LR achieve near-perfect fault detection accuracy of 98% with no missed detections or false alarms. These models successfully identify the decision boundaries that separate nominal and faulty engine states across the high-dimensional feature space. Instance-based learning with KNN shows good but substantially lower performance, with occasional missed detections and an accuracy of 91%. SVM and MLP prove unsuitable for this classification task due to suboptimal hyperparameters and model capacity limitations. All models are analysed at a granular level using receiver operating characteristic curves and confusion matrices. XGBoost, RF and LR exhibit a strong capability to detect anomalies. Feature importance estimates highlight the role of physically intuitive parameters, such as exhaust gas temperature and engine speeds, in the fault identification process.
Both tree-based models and LR demonstrate the promise of data-driven techniques for reliable, high-precision engine fault detection. SVM and MLP, by contrast, perform poorly on this task.
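
The accuracy, missed-detection and false-alarm figures quoted above are all read off a model's predictions against the true labels. As a minimal sketch of how these three quantities relate in a three-class setting, the Python snippet below counts them directly. The function name, the class labels ("normal", "hpt_fault", "other_fault") and the toy predictions are hypothetical and chosen only for illustration; they are not taken from the study's data set or code. A missed detection is assumed here to mean a faulty sample predicted as normal, and a false alarm a normal sample predicted as faulty.

```python
def detection_summary(labels_true, labels_pred, nominal="normal"):
    """Count accuracy, missed detections and false alarms.

    Assumed definitions (illustrative, not taken from the paper):
      missed detection: true label is a fault, prediction is nominal
      false alarm:      true label is nominal, prediction is a fault
    """
    if len(labels_true) != len(labels_pred) or not labels_true:
        raise ValueError("label lists must be non-empty and equal in length")

    correct = missed = false_alarms = 0
    for truth, pred in zip(labels_true, labels_pred):
        if truth == pred:
            correct += 1
        elif truth != nominal and pred == nominal:
            missed += 1
        elif truth == nominal and pred != nominal:
            false_alarms += 1
        # Remaining case: one fault class confused with another.
        # This lowers accuracy but is neither a miss nor a false alarm.

    return {
        "accuracy": correct / len(labels_true),
        "missed_detections": missed,
        "false_alarms": false_alarms,
    }


# Toy labels, invented purely for illustration.
y_true = ["normal", "normal", "hpt_fault", "hpt_fault", "other_fault", "other_fault"]
y_pred = ["normal", "normal", "hpt_fault", "other_fault", "other_fault", "normal"]

summary = detection_summary(y_true, y_pred)
print(summary)
```

Under these assumed definitions, confusing one fault class with another reduces accuracy without counting as a missed detection or a false alarm, so the three numbers carry different information and are best reported together.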
