Optimizing pollutant exposure, energy consumption, and thermal comfort in a house via deep reinforcement learning control
Source
Journal of Building Engineering
Date Issued
2025-11-15
Author(s)
Abstract
Efficient indoor environment management is challenging due to the complex interdependent nature of pollutant levels, energy use, and thermal comfort. While data-driven and physics-based algorithms have been explored to optimize these conflicting objectives, their real-world integration is constrained by modeling, computational, and learning challenges. Reinforcement learning (RL) has gained attention as a potential substitute in such scenarios. This work developed a reward-based deep RL agent for optimizing pollutant exposure, energy consumption, and thermal comfort by controlling ventilation rates and set temperature of the heating, ventilation, and air conditioning (HVAC) unit. Two RL agents were developed: Agent 1 controlled only the ventilation rate, and Agent 2 controlled both the ventilation rate and the HVAC set temperature. A reward function with weighted combinations of energy (W₁), pollutant exposure (W₂), and thermal discomfort (W₃) enabled specific control strategies with and without an HVAC filter. RL agents’ reliability under varying pollutant emission scenarios was evaluated against a physics-based dynamic optimization (DynOpt) strategy. Agent 1 achieved ∼26 %–∼133 % higher normalized exposure reduction (NER) than DynOpt at a W₁/W₂ ratio of 1/3. On the other hand, Agent 2 (W₁/W₂ = 1/10, and W₃ = 1 or 10) dynamically adjusted the set temperature between 25 °C and 27 °C, achieving NER values ∼17 %–∼373 % higher than Agent 1 across all reward combinations. The proposed algorithm can be deployed in the field through low-cost microcontrollers and monitors, presenting a potentially deployable solution for healthy indoor environments with minimal environmental impacts.
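The abstract names the three weights but does not give the functional form of the reward. As a rough illustration only, the sketch below assumes the reward is the negative of a weighted sum of three penalty terms, each already normalized to a comparable scale. The function name `weighted_reward`, its argument names, and the normalization are hypothetical and are not taken from the paper.

```python
def weighted_reward(energy, exposure, discomfort, w1=1.0, w2=3.0, w3=0.0):
    """Return a scalar reward from three normalized penalty terms.

    Assumed form (not stated in the abstract): the agent is penalized by
    a weighted sum of energy use (w1), pollutant exposure (w2), and
    thermal discomfort (w3), so a higher reward means lower combined cost.
    """
    return -(w1 * energy + w2 * exposure + w3 * discomfort)


# Hypothetical normalized penalties for a single control step.
step_energy, step_exposure, step_discomfort = 0.5, 0.2, 0.1

# Ventilation-only weighting with W1/W2 = 1/3 and no comfort term.
r_agent1 = weighted_reward(step_energy, step_exposure, step_discomfort,
                           w1=1.0, w2=3.0, w3=0.0)

# Weighting with W1/W2 = 1/10 and W3 = 1, as listed for Agent 2.
r_agent2 = weighted_reward(step_energy, step_exposure, step_discomfort,
                           w1=1.0, w2=10.0, w3=1.0)
```

Under this assumed form, raising W₂ relative to W₁ makes exposure more costly than energy use, which is the trade-off the weight ratios in the abstract are tuning.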
Keywords
Deep reinforcement learning | Dynamic set temperature | Energy-exposure-comfort trade-off | Indoor air pollution control | Variable indoor-outdoor air exchange rate
