Urban ozone variability using automated machine learning: inference from different feature importance schemes

dc.contributor.author Nath, Sankar Jyoti
dc.contributor.author Girach, Imran A.
dc.contributor.author Harithasree, S.
dc.contributor.author Bhuyan, Kalyan
dc.contributor.author Ojha, Narendra
dc.contributor.author Kumar, Manish
dc.coverage.spatial United Kingdom
dc.date.accessioned 2024-03-28T08:24:32Z
dc.date.available 2024-03-28T08:24:32Z
dc.date.issued 2024-04
dc.identifier.citation Nath, Sankar Jyoti; Girach, Imran A.; Harithasree, S.; Bhuyan, Kalyan; Ojha, Narendra and Kumar, Manish, "Urban ozone variability using automated machine learning: inference from different feature importance schemes", Environmental Monitoring and Assessment, DOI: 10.1007/s10661-024-12549-7, vol. 196, no. 4, Apr. 2024.
dc.identifier.issn 0167-6369
dc.identifier.issn 1573-2959
dc.identifier.uri https://doi.org/10.1007/s10661-024-12549-7
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/9913
dc.description.abstract Tropospheric ozone is an air pollutant at ground level and a greenhouse gas that contributes significantly to global warming. Strong anthropogenic emissions in and around urban environments enhance surface ozone pollution, adversely affecting human health and vegetation. However, observations are often scarce and the factors driving ozone variability remain uncertain in the developing regions of the world. Here, we conducted machine learning (ML) simulations of ozone variability and comprehensively examined the governing factors over a major urban environment (Ahmedabad) in western India. Ozone precursors (NO2, NO, CO, C5H8 and CH2O) from the CAMS (Copernicus Atmosphere Monitoring Service) reanalysis and meteorological parameters from ERA5 (the fifth-generation reanalysis of the European Centre for Medium-Range Weather Forecasts, ECMWF) were included as features in the ML models. Automated ML (AutoML) fitted the deep learning model optimally and simulated daily ozone with a root mean square error (RMSE) of ~2 ppbv, reproducing 84–88% of the variability. The model performance achieved here is comparable to that of widely used ML models (RF, Random Forest, and XGBoost, eXtreme Gradient Boosting). Explainability of the models is discussed through different schemes of feature importance, including SAGE (Shapley Additive Global importancE) and permutation importance. The leading features differ among the feature importance schemes. We show that urban ozone could be simulated well (RMSE = 2.5 ppbv and R² = 0.78) by considering the first four leading features from the different schemes, which are consistent with ozone photochemistry. Our study underscores the need to conduct science-informed analysis of feature importance from multiple schemes to infer the roles of input variables in ozone variability. AutoML-based studies, exploiting the potential of long-term observations, can strongly complement conventional chemistry-transport modelling and can also help in the accurate simulation and forecasting of urban ozone.
dc.description.statementofresponsibility by Sankar Jyoti Nath, Imran A. Girach, S. Harithasree, Kalyan Bhuyan, Narendra Ojha and Manish Kumar
dc.format.extent vol. 196, no. 4
dc.language.iso en_US
dc.publisher Springer
dc.subject Air quality
dc.subject Modelling
dc.subject Ozone
dc.subject Precursors
dc.subject Meteorology
dc.subject Artificial intelligence
dc.subject Machine learning
dc.subject AutoML
dc.subject XGBoost
dc.subject Random Forest
dc.subject Air pollution
dc.subject Atmospheric chemistry
dc.title Urban ozone variability using automated machine learning: inference from different feature importance schemes
dc.type Article
dc.relation.journal Environmental Monitoring and Assessment
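
The abstract above evaluates models by RMSE and R² and ranks inputs by permutation importance. The following is a minimal sketch of how these quantities are commonly computed, not the authors' code. The synthetic data, the least-squares stand-in model, and every function name here are assumptions made for illustration; the study itself used AutoML, Random Forest and XGBoost on CAMS and ERA5 features.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observations and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: fraction of variance reproduced."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link with the target
            # while leaving its marginal distribution unchanged.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Hypothetical synthetic data: feature 0 matters most, feature 2 not at all.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Stand-in model (an assumption): ordinary least squares with an intercept.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(X_new):
    return np.column_stack([X_new, np.ones(len(X_new))]) @ coef

imp = permutation_importance(predict, X, y)
ranking = np.argsort(imp)[::-1]  # feature indices, most important first

print("RMSE:", rmse(y, predict(X)))
print("R2:", r2(y, predict(X)))
print("Permutation importance:", imp)
print("Ranking:", ranking)
```

Permutation importance measures only how much a fitted model relies on each input, so correlated features can share or mask importance. This is one reason the abstract recommends comparing several schemes rather than trusting a single ranking.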

