Leveraging machine learning and geospatial data for air quality modeling over the Indian region
Source
Geospatial technologies: environmental and climate science applications and challenges
Date Issued
2026-01-01
Author(s)
Harithasree, S.
Girach, Imran
Chakraborti, Archisman
Vaishya, Aditya
Kumar, Manish
Tiwari, Rajarshi
Ojha, Narendra
Abstract
Air pollution significantly impacts public health, agriculture, climate, and the economy. Nevertheless, ground-based observations remain sparse, and conventional process-based models exhibit considerable uncertainties over South Asia. In India, air quality modeling is particularly challenging due to complex topography, large variations in emissions, and strong meteorological variability. In recent years, novel artificial intelligence and machine learning (AI/ML) techniques have been increasingly applied to simulate air quality and assess contributing factors. ML models trained on reanalysis datasets (ERA5 and CAMS) have successfully reproduced urban ozone (O3) variability at Indian stations and revealed significant roles of variations in air temperature and boundary layer height, in addition to the precursors. ML has also been combined with satellite-based observations to simulate the variability in aerosol optical depth (AOD) measured by NASA’s ground-based Aerosol Robotic Network (AERONET). High-resolution, long-term geospatial maps of fine particulate matter (PM2.5) have been generated by integrating satellite and ground-based data with ML. Validated ML models are being applied to fill data gaps, thereby helping to reduce uncertainties in assessments of air pollution impacts. Feature importance analyses can help design field measurements more strategically. ML models can further serve as a computationally efficient yet accurate alternative to conventional statistical/dynamical downscaling over India. Integrating multiple satellite observations, high-resolution regional modeling, and ground-based observations is crucial for providing air quality information at street level. In summary, AI/ML has been shown to transform air quality modeling, and refining these models with more comprehensive data will enhance prediction capabilities and support the framing of more effective mitigation policies.
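The feature importance analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: a plain least-squares fit stands in for the ML model, the data are synthetic, and the feature names, coefficients, and helper functions (`fit_linear`, `permutation_importance`) are assumptions made for illustration only. The idea shown is permutation importance: shuffle one predictor at a time and measure how much the prediction error grows.

```python
import random

def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for i in range(k):
            if i != c:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] for i in range(k)]  # [intercept, coef_1, ..., coef_n]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def permutation_importance(coef, X, y, rng):
    """Increase in mean squared error when each feature column is shuffled."""
    base = mse(y, predict(coef, X))
    scores = []
    for j in range(len(X[0])):
        col = [r[j] for r in X]
        rng.shuffle(col)
        X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
        scores.append(mse(y, predict(coef, X_perm)) - base)
    return scores

# Synthetic, standardized predictors (hypothetical; not ERA5/CAMS data).
rng = random.Random(0)
names = ["temperature", "boundary_layer_height", "unrelated"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(300)]
# Invented relationship: the target depends on the first two features only.
y = [3.0 * t - 1.5 * b + rng.gauss(0, 0.5) for t, b, _ in X]

coef = fit_linear(X, y)
importance = dict(zip(names, permutation_importance(coef, X, y, rng)))
for name, score in importance.items():
    print(f"{name}: {score:.3f}")
```

In this toy setup the shuffled temperature column degrades the fit the most and the unrelated column barely changes it. In practice the same procedure would be applied to a trained ML model on held-out observations rather than to a linear fit on its training data.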
