City Research Online - A Comparison of Learning Speed and Ability to Cope Without Exploration between DHP and TD(0)

Fairbank, M. & Alonso, E. (2012). A Comparison of Learning Speed and Ability to Cope Without Exploration between DHP and TD(0). Paper presented at the IEEE International Joint Conference on Neural Networks (IEEE IJCNN 2012), 1783-1789, 10-15 June 2012, Brisbane, Australia. doi: 10.1109/IJCNN.2012.6252569

Abstract

This paper demonstrates the principal motivations for using Dual Heuristic Dynamic Programming (DHP) learning methods in Adaptive Dynamic Programming and Reinforcement Learning in continuous state spaces: automatic local exploration, improved learning speed, and the ability to work without stochastic exploration in deterministic environments. In a simple experiment, DHP is shown to learn around 1700 times faster than TD(0). DHP solves the problem without any exploration, whereas TD(0) cannot solve it without explicit exploration. DHP requires that the environment's model functions be known and differentiable. This paper aims to illustrate the advantages of DHP when these two requirements are satisfied.
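
To make the contrast in the abstract concrete, the following is a minimal sketch of the two update rules, not the experiment reported in the paper. Everything in it is an illustrative assumption: a one-dimensional deterministic system whose state halves each step under a fixed policy, a reward of -x^2, and one-parameter critics (V(x) = w * x^2 for TD(0), G(x) = w_g * x for DHP). The point it shows is that TD(0) learns from sampled rewards and values alone, while DHP learns the value gradient and therefore needs the derivatives of the model and reward functions.

```python
# Toy closed-loop system. All constants and function names here are
# hypothetical choices for illustration; none come from the paper.
C = 0.5       # next state is C * x under the fixed policy
GAMMA = 1.0   # no discounting; the state decays towards 0 anyway


def model(x):
    return C * x


def reward(x):
    return -x * x


def dmodel_dx(x):
    return C            # derivative of the model, required by DHP


def dreward_dx(x):
    return -2.0 * x     # derivative of the reward, required by DHP


def td0_step(w, x, alpha):
    """One TD(0) update of the value critic V(x) = w * x**2."""
    x_next = model(x)
    td_error = reward(x) + GAMMA * w * x_next ** 2 - w * x ** 2
    return w + alpha * td_error * x ** 2          # dV/dw = x**2


def dhp_step(w_g, x, alpha):
    """One DHP update of the value-gradient critic G(x) = w_g * x."""
    x_next = model(x)
    # Target gradient: differentiate r(x) + GAMMA * V(model(x)) w.r.t. x.
    target = dreward_dx(x) + GAMMA * dmodel_dx(x) * (w_g * x_next)
    return w_g + alpha * (target - w_g * x) * x   # dG/dw_g = x


def train(step, alpha=0.5, episodes=200, x0=1.0, horizon=10):
    """Apply `step` along repeated deterministic trajectories from x0."""
    w = 0.0
    for _ in range(episodes):
        x = x0
        for _ in range(horizon):
            w = step(w, x, alpha)
            x = model(x)
    return w


if __name__ == "__main__":
    print("TD(0) weight:", train(td0_step))
    print("DHP weight:  ", train(dhp_step))
```

For this toy system the true value function is V(x) = -x^2 / (1 - C^2), so the TD(0) weight should approach -4/3 and the DHP weight, which represents dV/dx, should approach -8/3. Because the policy is fixed and both critics can represent the answer exactly, the sketch says nothing about exploration or relative learning speed; those results come from the paper's own experiment.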

Publication Type:	Conference or Workshop Item (Paper)
Additional Information:	© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Publisher Keywords:	Dual Heuristic Dynamic Programming; Heuristic Dynamic Programming; DHP; Adaptive Dynamic Programming; Reinforcement Learning
Subjects:	L Education > L Education (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments:	School of Science & Technology > Computer Science

Text - Accepted Version

Official URL: http://dx.doi.org/10.1109/IJCNN.2012.6252569

Creators:	Fairbank, M.; Alonso, E.
Event Title:	IEEE International Joint Conference on Neural Networks (IEEE IJCNN 2012), 1783-1789
Event Type:	Conference
Event Location:	Brisbane, Australia
Event Dates:	10-15 June 2012
Status:	Published
Refereed:	Yes
Publisher:	IEEE Press
URI:	https://openaccess.city.ac.uk/id/eprint/5201
Date available in CRO:	10 Jul 2015 13:57
Date deposited:	27 July 2017
Dates:	2012 (Published)