City Research Online - Back to optimality: a formal framework to express the dynamics of learning optimal behavior

Back to optimality: a formal framework to express the dynamics of learning optimal behavior

Alonso, E., Fairbank, M. & Mondragon, E. ORCID: 0000-0003-4180-1261 (2015). Back to optimality: a formal framework to express the dynamics of learning optimal behavior. Adaptive Behavior, 23(4), pp. 206-215. doi: 10.1177/1059712315589355

Abstract

Whether animals behave optimally is an open question of great importance, both theoretically and in practice. Attempts to answer this question focus on two aspects of the optimization problem, the quantity to be optimized and the optimization process itself. In this paper, we assume the abstract concept of cost as the quantity to be minimized and propose a reinforcement learning algorithm, called Value-Gradient Learning (VGL), as a computational model of behavior optimality. We prove that, unlike standard models of Reinforcement Learning, Temporal Difference in particular, VGL is guaranteed to converge to optimality under certain conditions. The core of the proof is the mathematical equivalence of VGL and Pontryagin’s Minimum Principle, a well-known optimization technique in systems and control theory. Given the similarity between VGL’s formulation and regulatory models of behavior, we argue that our algorithm may provide psychologists with a tool to formulate such models in optimization terms.

Publication Type:	Article
Publisher Keywords:	Optimality; Principle of Least Action; bliss point; reinforcement learning; Value-Gradient Learning
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments:	School of Science & Technology > Computer Science
SWORD Depositor:	Symplectic Administrator

Text - Accepted Version

Creators:	Alonso, E.; Fairbank, M.; Mondragon, E. (ORCID: 0000-0003-4180-1261)
Status:	Published
Refereed:	Yes
Journal or Publication Title:	Adaptive Behavior
Publisher:	SAGE Publications
ISSN:	1059-7123
e-ISSN:	1741-2633
URI:	https://openaccess.city.ac.uk/id/eprint/12258
Date available in CRO:	30 Jul 2015 09:32
Date deposited:	28 July 2017
Dates:	Published: August 2015