City Research Online

Clipping in Neurocontrol by Adaptive Dynamic Programming

Fairbank, M., Prokhorov, D. and Alonso, E. (2014). Clipping in Neurocontrol by Adaptive Dynamic Programming. IEEE Transactions on Neural Networks and Learning Systems, 25(10), pp. 1909-1920. doi: 10.1109/TNNLS.2014.2297991

Abstract

In adaptive dynamic programming, neurocontrol, and reinforcement learning, the objective is for an agent to learn to choose actions so as to minimize a total cost function. In this paper, we show that when discretized time is used to model the motion of the agent, it can be very important to do clipping on the motion of the agent in the final time step of the trajectory. By clipping, we mean that the final time step of the trajectory is to be truncated such that the agent stops exactly at the first terminal state reached, and no distance further. We demonstrate that when clipping is omitted, learning performance can fail to reach the optimum, and when clipping is done properly, learning performance can improve significantly. The clipping problem we describe affects algorithms that use explicit derivatives of the model functions of the environment to calculate a learning gradient. These include backpropagation through time for control and methods based on dual heuristic programming. However, the clipping problem does not significantly affect methods based on heuristic dynamic programming, temporal differences learning, or policy-gradient learning algorithms.

Publication Type: Article
Additional Information: (c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Publisher Keywords: Clipping, Neurocontrol, Dual Heuristic Programming (DHP), Backpropagation Through Time (BPTT), Value-Gradient Learning
Subjects: B Philosophy. Psychology. Religion > BF Psychology
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
R Medicine > RC Internal medicine > RC0321 Neuroscience. Biological psychiatry. Neuropsychiatry
Departments: School of Mathematics, Computer Science & Engineering > Computer Science
URI: http://openaccess.city.ac.uk/id/eprint/5181
[img]
Preview
Text - Accepted Version
Download (608kB) | Preview

Export

Downloads

Downloads per month over past year

View more statistics

Actions (login required)

Admin Login Admin Login