This study examines employee turnover using machine learning models, focusing on both predictive performance and practical application. It aims to identify the main factors influencing turnover and to evaluate the effectiveness of classification and regression trees (CART) alongside widely used models such as Random Forest, XGBoost, Logistic Regression, Naïve Bayes and Support Vector Machines.
The study uses the IBM human resources (HR) Analytics data set, which contains records from 15,000 employees, to analyse turnover patterns. Multiple machine learning models are compared on Precision, Recall and F1-Score, and the comparison weighs predictive accuracy against interpretability.
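For readers less familiar with the three evaluation metrics, the following is a minimal sketch of how Precision, Recall and F1-Score are derived from a classifier's predictions. It is an illustration only and not the study's evaluation code: the function name `precision_recall_f1` and the toy labels are hypothetical, with 1 taken to mean that an employee left and 0 that the employee stayed.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # correctly flagged leavers
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # stayers flagged as leavers
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # leavers the model missed

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical toy labels (assumption, not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a turnover setting, Recall reflects how many actual leavers a model catches, Precision reflects how many flagged employees actually leave, and F1-Score summarises the two as their harmonic mean.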
The results indicate that Random Forest and XGBoost achieved strong predictive performance, while CART provided comparable accuracy with the added advantage of decision clarity. The analysis highlights key drivers of turnover, including job satisfaction, compensation and tenure, reinforcing the importance of interpretable models in HR decision-making.
This study helps HR professionals build better retention strategies using interpretable models such as CART. It also compares CART with the models most frequently recommended in previous research, giving practitioners a clear view of its practical value.
The value of this paper lies in its comparative analysis of machine learning models for turnover prediction. It gives researchers and HR practitioners insight into the most effective models and the key factors influencing employee retention, which can in turn help limit turnover.
