An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time
Fairbank, M., Alonso, E. & Prokhorov, D. (2013). An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time. IEEE Transactions on Neural Networks and Learning Systems, 24(12), pp. 2088-2100. doi: 10.1109/tnnls.2013.2271778
Abstract
We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which learns a critic function using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP into a new algorithm that we call Value-Gradient Learning, VGL(λ), and prove that an instance of the new algorithm is equivalent to Backpropagation Through Time for Control with a greedy policy. Not only does this equivalence provide a link between these two different approaches, but it also gives our variant of DHP guaranteed convergence, under certain smoothness conditions and a greedy policy, when using a general smooth nonlinear function approximator for the critic. We consider several experimental scenarios, including some that prove divergence of DHP under a greedy policy, in contrast with our proven-convergent algorithm.
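As a rough illustration of the Backpropagation Through Time for Control side of this equivalence, the sketch below computes the gradient of a trajectory's total reward with respect to a policy parameter by a backward pass over value gradients. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar model `f(x, u) = x + u`, the reward `r(x, u) = -(x^2 + u^2)`, the linear policy `u = w * x` (a parametrized policy, not the greedy policy on a critic that the abstract considers), and the function name `return_and_gradient`.

```python
def return_and_gradient(w, x0, steps, gamma=1.0):
    """Total discounted reward and its derivative with respect to w.

    Hypothetical toy problem (not from the paper):
      model   f(x, u) = x + u
      reward  r(x, u) = -(x**2 + u**2)
      policy  u = w * x
    """
    # Forward pass: unroll the trajectory and store states and actions.
    xs, us = [x0], []
    for _ in range(steps):
        u = w * xs[-1]
        us.append(u)
        xs.append(xs[-1] + u)
    total = sum(gamma ** t * -(xs[t] ** 2 + us[t] ** 2) for t in range(steps))

    # Backward pass: g_next holds dR_{t+1}/dx_{t+1}, the value gradient
    # at the next state; it is zero beyond the final step.
    g_next = 0.0
    grad_w = 0.0
    for t in reversed(range(steps)):
        x, u = xs[t], us[t]
        dr_dx, dr_du = -2.0 * x, -2.0 * u   # partial derivatives of r
        df_dx, df_du = 1.0, 1.0             # partial derivatives of f
        dpi_dx, dpi_dw = w, x               # partial derivatives of the policy
        # Derivative of the reward-to-go from step t with respect to u_t.
        q_u = dr_du + gamma * df_du * g_next
        grad_w += gamma ** t * dpi_dw * q_u
        # Total derivative of the reward-to-go with respect to x_t.
        g_next = dr_dx + gamma * df_dx * g_next + dpi_dx * q_u
    return total, grad_w
```

The backward recursion on `g_next` is the value-gradient quantity that a DHP-style critic would try to approximate. Here it is computed exactly from the trajectory, so no critic is learned.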
Publication Type: | Article |
---|---|
Additional Information: | (c) 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Publisher Keywords: | Adaptive Dynamic Programming, Dual Heuristic Programming, Value-Gradient Learning, Backpropagation Through Time, Neural Networks |
Subjects: | B Philosophy. Psychology. Religion > BF Psychology; Q Science > QA Mathematics > QA75 Electronic computers. Computer science; R Medicine > RC Internal medicine > RC0321 Neuroscience. Biological psychiatry. Neuropsychiatry |
Departments: | School of Science & Technology > Computer Science |