Evaluation of Reinforcement Learning for Optimal Control of Building Active and Passive Thermal Storage Inventory

Author and Article Information
Simeng Liu

 Bes-Tech Inc., 3910 South Interstate Highway 35, Suite 225, Austin, TX 78704; sliu@bes-tech.net

Gregor P. Henze

Department of Architectural Engineering, University of Nebraska-Lincoln, Omaha, NE 68182; ghenze@unl.edu

J. Sol. Energy Eng. 129(2), 215–225 (Oct 31, 2006) (11 pages); doi:10.1115/1.2710491. History: Received May 22, 2005; Revised October 31, 2006

This paper describes an investigation of machine learning for supervisory control of active and passive thermal storage capacity in buildings. Previous studies show that utilizing active or passive thermal storage, or both, can yield significant peak cooling load reductions and associated electrical demand and operating cost savings. In this study, a model-free learning controller is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to operate the building and cooling plant based on the reinforcement feedback (in this study, the monetary cost of each action) it receives for past control actions. The learning agent interacts with its environment by commanding the global zone temperature setpoints and the thermal energy storage charging/discharging rate. The controller extracts information about the environment solely from the reinforcement signal; it contains no predictive or system model. Over time, by exploring the environment, the reinforcement learning controller builds a statistical summary of plant operation, which is continuously updated as operation continues. The present analysis shows that learning control is a feasible methodology for finding a near-optimal control strategy that exploits the active and passive building thermal storage capacity, and that learning performance is affected by the dimensionality of the action and state spaces, the learning rate, and several other factors. It is found that learning control strategies for tasks with large state and action spaces requires long training times.
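The model-free scheme described above can be sketched as tabular Q-learning: the agent discretizes its two command variables (zone temperature setpoint and TES charging/discharging rate), picks actions ε-greedily, and updates its statistical summary (the Q-table) from the monetary cost signal alone. The discretization grid, tariff, toy load model, and hyperparameters below are illustrative assumptions, not the paper's simulation environment:

```python
import random

random.seed(0)

# Hypothetical discretizations (illustrative values, not the paper's)
SETPOINTS = [20.0, 22.0, 24.0]   # global zone temperature setpoints [deg C]
TES_RATES = [-1.0, 0.0, 1.0]     # TES discharge (-) / idle (0) / charge (+)
ACTIONS = [(sp, u) for sp in SETPOINTS for u in TES_RATES]
HOURS = 24                       # toy state: hour of day

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

# Q-table: Q[state][action index], initialized to zero
Q = [[0.0 for _ in ACTIONS] for _ in range(HOURS)]

def energy_cost(hour, setpoint, tes_rate):
    """Illustrative monetary cost (the reinforcement signal): on-peak
    hours are expensive; charging adds load, discharging offsets it."""
    price = 0.20 if 12 <= hour < 18 else 0.05   # assumed $/kWh tariff
    load = max(0.0, 26.0 - setpoint) + tes_rate  # toy cooling load model
    return price * max(load, 0.0)

def choose_action(state):
    """Epsilon-greedy selection: explore with probability EPSILON,
    otherwise pick the action with the lowest learned cost-to-go."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return min(range(len(ACTIONS)), key=lambda a: Q[state][a])

def train(episodes=500):
    for _ in range(episodes):
        for hour in range(HOURS):
            a = choose_action(hour)
            sp, u = ACTIONS[a]
            cost = energy_cost(hour, sp, u)
            nxt = (hour + 1) % HOURS
            # Q-learning update; since we minimize cost, the bootstrap
            # term uses the minimum over next-state action values
            target = cost + GAMMA * min(Q[nxt])
            Q[hour][a] += ALPHA * (target - Q[hour][a])

train()
```

The paper replaces this lookup table with an artificial neural network approximation of the Q-value (see Fig. 6), which becomes necessary as the state and action spaces grow and the table form stops being tractable.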

Copyright © 2007 by American Society of Mechanical Engineers



Figure 1: Schematic of the reinforcement learning problem

Figure 2: Schematic of the learning controller

Figure 3: Cooling load profiles of EP model and Simulink model before and after calibration

Figure 4: Schedule of utility rate and internal heat gain

Figure 5: Optimal zone air temperature setpoint profile

Figure 6: Updating Q value with an artificial neural network

Figure 7: Comparison of all reinforcement learning cases for the multitask scenario

Figure 8: Learning of zone temperature setpoint Tsp

Figure 9: Learning of TES charge/discharge rate u


