Interpretable Modelling of Benzene Variability Across Europe: from Shap Dependence to Temperature Regimes




Abstract:
This study presents an integrated machine learning (ML) and explainable artificial intelligence (XAI) framework for modelling daily benzene concentrations at 84 monitoring sites across Europe from February 2020 to October 2021. A suite of ensemble tree-based algorithms was evaluated, with metaheuristic optimization used to tune the models; the optimized LightGBM model achieved the best agreement with observations (R² ≈ 0.85). To interpret the model, SHAP (SHapley Additive exPlanations) was applied at both global and local levels. In addition to conventional SHAP analysis, SHAP-based clustering was used to identify distinct environmental settings, and the temperature dependence of SHAP values was parameterized using a segment-wise linear approximation. The results show strong spatial heterogeneity across Europe, both in benzene levels and in the magnitude of temperature importance. At the same time, the signed SHAP dependence for 2 m air temperature (T02m) exhibits a remarkably coherent cross-site structure: positive contributions under cold conditions, a progressive decline toward a zero crossing around 9.5 °C, a broad negative regime through mild and warm conditions, and a weak rebound at the highest temperatures. This indicates that temperature acts less as an isolated driver than as a compact state variable that reflects seasonal accumulation, dispersion, oxidation, and temperature-sensitive emission processes. The main novelty of the study is therefore not the use of SHAP alone, but the extraction of transferable regime parameters from the SHAP structure. The proposed framework combines predictive skill with interpretability and can be extended to other predictors and interactions in large-scale air-quality applications.
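
The segment-wise linear parameterization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `fit_segments` and `zero_crossings`, the breakpoints, and the synthetic temperature and SHAP values are all assumptions made for illustration. The toy curve is constructed by hand to mimic the qualitative shape described above (positive under cold conditions, a zero crossing, a negative regime, a weak rebound) and does not reproduce any result from the study.

```python
import numpy as np

def fit_segments(temp, shap_vals, breakpoints):
    """Fit an independent least-squares line to each temperature segment.

    Returns a list of (lower, upper, slope, intercept) tuples.
    Hypothetical helper, not taken from the paper.
    """
    temp = np.asarray(temp, dtype=float)
    shap_vals = np.asarray(shap_vals, dtype=float)
    edges = np.asarray(breakpoints, dtype=float)
    segments = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        mask = (temp >= lower) & (temp <= upper)
        if mask.sum() < 2:
            continue  # a line needs at least two points
        slope, intercept = np.polyfit(temp[mask], shap_vals[mask], 1)
        segments.append((lower, upper, slope, intercept))
    return segments

def zero_crossings(segments):
    """Return temperatures where a fitted segment crosses SHAP = 0."""
    roots = []
    for lower, upper, slope, intercept in segments:
        if slope == 0:
            continue  # a flat segment has no unique root
        root = -intercept / slope
        if lower <= root <= upper:
            roots.append(float(root))
    return roots

# Synthetic, noise-free toy data (assumed values, for illustration only).
knots_t = [-10.0, 0.0, 20.0, 30.0, 35.0]      # assumed breakpoints in deg C
knots_s = [1.0, 0.95, -1.05, -1.0, -0.8]      # assumed SHAP values at the knots
temp = np.linspace(-10.0, 35.0, 181)          # 0.25 deg C steps
shap_vals = np.interp(temp, knots_t, knots_s)

segments = fit_segments(temp, shap_vals, knots_t)
roots = zero_crossings(segments)

for lower, upper, slope, intercept in segments:
    print(f"[{lower:6.1f}, {upper:6.1f}] slope={slope:+.4f} intercept={intercept:+.4f}")
print("zero crossings:", [round(r, 2) for r in roots])
```

In this sketch the slope and intercept of each segment, together with the zero-crossing temperature, play the role of regime parameters that could be compared across sites. With real SHAP values the breakpoints would need to be chosen or estimated from the data, and independent per-segment fits are not guaranteed to join continuously at the breakpoints.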

CITATION:

IEEE format

T. Bezdan, G. Isibor, G. Jovanović, A. Stojić, M. Perišić, “Interpretable Modelling of Benzene Variability Across Europe: from Shap Dependence to Temperature Regimes,” in Sinteza 2026 - International Scientific Conference on Information Technology, Computer Science, and Data Science, Belgrade, Singidunum University, Serbia, 2026, pp. 226-232. doi:10.15308/Sinteza-2026-226-232

APA format

Bezdan, T., Isibor, G., Jovanović, G., Stojić, A., & Perišić, M. (2026). Interpretable Modelling of Benzene Variability Across Europe: from Shap Dependence to Temperature Regimes. Paper presented at Sinteza 2026 - International Scientific Conference on Information Technology, Computer Science, and Data Science. doi:10.15308/Sinteza-2026-226-232
