The divergence of reinforcement learning algorithms with value-iteration and function approximation
Fairbank, M. & Alonso, E. (2012). The divergence of reinforcement learning algorithms with value-iteration and function approximation. Paper presented at the 2012 International Joint Conference on Neural Networks (IJCNN), 10-06-2012 - 15-06-2012, Brisbane, Australia. doi: 10.1109/IJCNN.2012.6252792
Abstract
This paper gives specific divergence examples of value-iteration for several major Reinforcement Learning and Adaptive Dynamic Programming algorithms, when using a function approximator for the value function. These divergence examples differ from previous divergence examples in the literature in that they apply under a greedy policy, i.e. in a “value iteration” scenario. Perhaps surprisingly, with a greedy policy it is also possible to get divergence for the algorithms TD(1) and Sarsa(1). In addition, we demonstrate divergence for the Adaptive Dynamic Programming algorithms HDP, DHP and GDHP.
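To make the setting concrete, the following is a minimal illustrative sketch of the kind of scheme the abstract refers to: value iteration with a greedy backup where the value function is represented by a linear function approximator and updated by a semi-gradient step toward the bootstrapped target. It is not the paper's counterexample; the MDP, features, and step size below are arbitrary assumptions chosen only to show the structure of the update.

```python
import numpy as np

# Illustrative sketch (not the paper's specific counterexample): value iteration
# with a linear approximator V(s) = phi(s) . w on a small deterministic MDP.
# The transitions, rewards, features, gamma and alpha are all assumed values.

gamma = 0.99          # discount factor
alpha = 0.1           # step size for the weight update
n_states, n_actions = 3, 2

# Hypothetical deterministic dynamics: next_state[s, a] and reward[s, a]
next_state = np.array([[1, 2], [2, 0], [0, 1]])
reward = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 0.0]])

# Linear features: one 2-dimensional feature vector per state (fewer weights than states)
phi = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
w = np.zeros(2)

def v(s, w):
    """Approximate value of state s under weights w."""
    return phi[s] @ w

for sweep in range(200):
    for s in range(n_states):
        # Greedy (value-iteration) backup: target = max_a [ r(s,a) + gamma * V(s') ]
        target = max(reward[s, a] + gamma * v(next_state[s, a], w)
                     for a in range(n_actions))
        # Semi-gradient step toward the bootstrapped target
        w += alpha * (target - v(s, w)) * phi[s]

print("final weights:", w)
```

The interaction of the greedy max backup, bootstrapping, and the shared weights of the approximator is what can drive the weights away from any fixed point; for particular choices of features and dynamics, such as those constructed in the paper, the weights grow without bound.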
| Publication Type: | Conference or Workshop Item (Paper) |
|---|---|
| Additional Information: | © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
| Publisher Keywords: | Adaptive Dynamic Programming, Reinforcement Learning, Greedy Policy, Value Iteration, Divergence |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Departments: | School of Science & Technology > Computer Science |