This review critically examines the use of machine learning (ML) for calibrating low-cost air quality sensors (LCSs), which are increasingly deployed for high-resolution monitoring but suffer from significant accuracy limitations. It aims to synthesize recent advances, evaluate methodological strengths and weaknesses and clarify ongoing debates regarding the reliability, transparency and generalizability of ML-based calibration strategies.
This review draws on more than 90 peer-reviewed studies published between 2013 and early 2024, identified through structured searches in Web of Science, Scopus, IEEE Xplore and Google Scholar using combinations of keywords such as “low-cost air quality sensor,” “machine learning calibration,” “drift correction” and “transferability.” It surveys calibration approaches applied to the major sensor types (optical, electrochemical, metal-oxide semiconductor and nondispersive infrared) across pollutants such as particulate matter, ozone and nitrogen dioxide. Both traditional regression methods and advanced ML models are analyzed. The review highlights methodological practices, performance benchmarks and controversies regarding overfitting, model transferability and the role of ancillary variables.
The evidence indicates that ML calibration can reduce error metrics by more than 50% and raise agreement with reference monitors to R² values of 0.8–0.9 or higher. For optical particulate matter sensors, cross-study evaluations commonly report postcalibration R² in the 0.8–0.95 range with slopes close to unity under long-term colocation, whereas calibrated electrochemical gas sensors for NO₂ and O₃ more typically achieve R² between about 0.6 and 0.9, with larger site-to-site variability. Case studies from diverse environments illustrate that neural networks and gradient boosting often outperform simpler models when sufficient training data are available, while regression approaches remain robust and comparatively stable under limited-data conditions. However, challenges persist, including sensor drift, the lack of standardized protocols and limited generalizability across sites. Transparency concerns, particularly with black-box models, further complicate adoption in regulatory settings.
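As a minimal sketch of how the error and R² figures summarized above are typically computed, the following Python example fits a simple linear calibration on the first part of a colocation record and evaluates it on the held-out remainder. The data are synthetic and purely illustrative (an assumed gain, offset and noise level, not values from any reviewed study), and the helper names `fit_linear`, `rmse` and `r_squared` are hypothetical. A univariate regression stands in here for the ML models discussed in the review.

```python
import math
import random


def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pred, ref):
    """Root mean square error of predictions against the reference."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def r_squared(pred, ref):
    """Coefficient of determination of predictions against the reference."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot


# Synthetic colocation record (assumed values, not data from any study):
# the raw sensor over-reads the reference with a gain, an offset and noise.
rng = random.Random(0)
reference = [rng.uniform(5.0, 60.0) for _ in range(500)]
raw = [1.6 * r + 4.0 + rng.gauss(0.0, 3.0) for r in reference]

# Fit on the first 70% of the record, evaluate on the held-out remainder.
split = int(0.7 * len(reference))
a, b = fit_linear(raw[:split], reference[:split])
calibrated = [a + b * x for x in raw[split:]]
held_out_ref = reference[split:]

rmse_raw = rmse(raw[split:], held_out_ref)
rmse_cal = rmse(calibrated, held_out_ref)
r2_raw = r_squared(raw[split:], held_out_ref)
r2_cal = r_squared(calibrated, held_out_ref)

print(f"RMSE raw={rmse_raw:.2f} calibrated={rmse_cal:.2f}")
print(f"R2   raw={r2_raw:.2f} calibrated={r2_cal:.2f}")
```

Evaluating on data not used for fitting matters here: metrics computed on the training portion alone would overstate performance, which is one form of the overfitting concern raised in the literature.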
By synthesizing results across pollutants, algorithms and deployment contexts, this review offers a balanced appraisal of ML’s potential and limitations for LCS calibration. It identifies best practices, emphasizes the importance of training data and validation strategies, and underscores emerging hybrid methods that integrate sensor physics with data-driven models. The analysis provides guidance for researchers, practitioners and policymakers seeking to enhance the reliability and scalability of low-cost sensor networks for air quality management.
