Reference:
F. Ruelens, B. J. Claessens, S. Vandael, B. De Schutter, R. Babuska,
and R. Belmans, "Residential demand response of thermostatically
controlled loads using batch reinforcement learning," IEEE
Transactions on Smart Grid, vol. 8, no. 5, pp. 2149-2159, Sept. 2017.
Abstract:
Driven by recent advances in batch reinforcement learning (RL), this
paper contributes to the application of batch RL to demand response.
In contrast to conventional model-based approaches, batch RL
techniques do not require a system identification step, making them
more suitable for large-scale implementation. This paper extends
fitted Q-iteration, a standard batch RL technique, to the setting in
which a forecast of the exogenous data is provided. In general, batch
RL techniques do not rely on expert knowledge about the system
dynamics or the solution. However, if some expert knowledge is
provided, it can be incorporated by using the proposed policy
adjustment method. Finally, we tackle the challenge of finding an
open-loop schedule required to participate in the day-ahead market. We
propose a model-free Monte Carlo method that uses a metric based on
the state-action value function (Q-function), and we illustrate this
method by finding the day-ahead schedule of a heat-pump thermostat.
Our experiments show that batch RL techniques provide a valuable
alternative to model-based controllers and that they can be used to
construct both closed-loop and open-loop policies.
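
Example (illustrative sketch, not from the paper):
The fitted Q-iteration step summarized in the abstract can be sketched
as a regression loop over a batch of transitions whose state vector is
augmented with exogenous forecast features (e.g., outside temperature
or price). The extremely randomized trees regressor, the discrete
action set, and the feature layout below are assumptions made for
illustration only; the paper's exact formulation differs in its
details.

    # Sketch of fitted Q-iteration (FQI) on a batch of transition
    # tuples (x, u, r, x_next), where the state x is assumed to be
    # augmented with exogenous forecast features.
    import numpy as np
    from sklearn.ensemble import ExtraTreesRegressor

    def fitted_q_iteration(batch, actions, n_iterations=50, gamma=0.95):
        """batch: list of (x, u, r, x_next) tuples, x includes forecast."""
        X = np.array([np.append(x, u) for x, u, _, _ in batch])
        r = np.array([r for _, _, r, _ in batch])
        x_next = np.array([xn for _, _, _, xn in batch])

        q_model = None
        targets = r.copy()                # Q_1 targets: immediate reward
        for _ in range(n_iterations):
            q_model = ExtraTreesRegressor(n_estimators=50).fit(X, targets)
            # Bootstrapped targets: r + gamma * max_u' Q(x', u')
            q_next = np.column_stack([
                q_model.predict(
                    np.column_stack([x_next, np.full(len(x_next), u)]))
                for u in actions
            ])
            targets = r + gamma * q_next.max(axis=1)
        return q_model

    def greedy_action(q_model, x, actions):
        """Closed-loop policy: action maximizing the fitted Q-function."""
        q_values = [q_model.predict([np.append(x, u)])[0] for u in actions]
        return actions[int(np.argmax(q_values))]

The greedy policy over the fitted Q-function gives the closed-loop
thermostat controller; expert knowledge, if available, would be layered
on top of this policy (the paper's policy adjustment method), which is
not shown here.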
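
Example (illustrative sketch, not from the paper):
The open-loop day-ahead schedule can similarly be sketched with a
model-free Monte Carlo rollout that stitches an artificial trajectory
from the batch, using a Q-function-based distance to pick each matching
transition. The specific distance (absolute difference of Q-values) and
the single-rollout scoring are illustrative assumptions, not the
authors' exact procedure.

    # Sketch of a model-free Monte Carlo rollout that scores a candidate
    # open-loop (day-ahead) action sequence using only stored transitions
    # and a fitted Q-function (e.g., from fitted_q_iteration above).
    import numpy as np

    def mfmc_return(batch, q_model, x0, schedule):
        """Estimate the return of a fixed action sequence.

        batch:    list of (x, u, r, x_next) tuples
        q_model:  fitted Q-function with .predict on [x, u] features
        x0:       initial state (including exogenous forecast features)
        schedule: open-loop action sequence u_0, ..., u_{T-1}
        """
        used, x, total_return = set(), np.asarray(x0), 0.0
        for u in schedule:
            q_target = q_model.predict([np.append(x, u)])[0]
            # Q-function-based metric: pick the unused transition whose
            # state-action value is closest to Q(x, u).
            best_i = min(
                (i for i in range(len(batch)) if i not in used),
                key=lambda i: abs(
                    q_model.predict(
                        [np.append(batch[i][0], batch[i][1])])[0]
                    - q_target),
            )
            used.add(best_i)
            _, _, r_i, x_next_i = batch[best_i]
            total_return += r_i
            x = np.asarray(x_next_i)   # continue from matched next state
        return total_return

Averaging several such rollouts per candidate action sequence and
keeping the sequence with the best average return yields an open-loop
schedule of the kind that could be offered in the day-ahead market.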