City Research Online

Cognitive Modelling with Burst Learning and Targeted Constrained Search

Koluman, C. (2024). Cognitive Modelling with Burst Learning and Targeted Constrained Search. (Unpublished Doctoral thesis, City, University of London).

Abstract

This work investigates nonrational iterative learning and searching in a stochastic setting, where the nature of the stochasticity is unknown. Such problems are difficult because, at each iteration, the decision-making model strives to make the best decision while simultaneously developing its representation of the underlying stochasticity. Outside of a nonrational context, Q-learning and stochastic approximation provide well-known methods for solving such problems, subject to restrictions on the speed of learning rate decay and with the use of an infinite time horizon.
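
For reference, the restrictions on learning rate decay mentioned above are usually stated as the Robbins-Monro step-size conditions. The notation below (learning rate $\alpha_t$ at iteration $t$, initial rate $\alpha_0$, decay constant $\lambda$) is assumed for illustration and is not taken from the thesis.

```latex
% Standard step-size conditions for stochastic approximation and Q-learning:
\sum_{t=1}^{\infty} \alpha_t = \infty ,
\qquad
\sum_{t=1}^{\infty} \alpha_t^{2} < \infty .

% Example that satisfies both conditions: \alpha_t = 1/t.

% An exponentially decaying rate \alpha_t = \alpha_0 e^{-\lambda t}, \lambda > 0,
% has a finite sum and therefore fails the first condition:
\sum_{t=1}^{\infty} \alpha_0 e^{-\lambda t}
  = \frac{\alpha_0 e^{-\lambda}}{1 - e^{-\lambda}}
  = \frac{\alpha_0}{e^{\lambda} - 1}
  < \infty .
```

Under these assumptions, a learning rate that decays exponentially falls outside the regime in which the standard convergence guarantees apply.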

The nonrational context proposed here departs from the usual Q-learning approaches by stipulating that the learning rate decays exponentially. Additionally, a search technique named Constrained Single Unconstrained Double perturbation stochastic approximation (CSUD) is introduced. CSUD comprises a probabilistic hybrid of double- and single-sided simultaneous perturbation stochastic approximation, and is able to constrain not only input updates but also input perturbations. Using performance criteria targeting loss functions and input constraints, a nonrational CSUD search strategy is developed, in the sense of producing not globally unique but only satisficing outcomes.
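
The idea of mixing single- and double-sided simultaneous perturbation while keeping both the updates and the perturbed points inside input constraints can be illustrated with a generic sketch. This is not the thesis's CSUD algorithm: the function names, the box constraints `lo`/`hi`, the mixing probability `p_double`, the gain constants, and the choice to shrink the box before perturbing are all assumptions made for illustration.

```python
import random


def clip(x, lo, hi):
    """Project each coordinate of x onto the interval [lo_i, hi_i]."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def hybrid_spsa(loss, x0, lo, hi, steps=500, a=0.2, c=0.1,
                p_double=0.5, seed=0):
    """Illustrative box-constrained SPSA mixing one- and two-sided estimates.

    Hypothetical sketch only: assumes hi_i - lo_i > 2 * c for every coordinate.
    """
    rng = random.Random(seed)
    x = clip(x0, lo, hi)
    for k in range(1, steps + 1):
        ak = a / k ** 0.602   # update gain (commonly used SPSA exponent)
        ck = c / k ** 0.101   # perturbation size (commonly used SPSA exponent)
        # Shrink the box by ck so that every perturbed point stays feasible.
        base = clip(x, [l + ck for l in lo], [h - ck for h in hi])
        # Rademacher (+1/-1) simultaneous perturbation direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in base]
        plus = [b + ck * d for b, d in zip(base, delta)]
        if rng.random() < p_double:
            # Double-sided estimate: two loss evaluations around base.
            minus = [b - ck * d for b, d in zip(base, delta)]
            scale = (loss(plus) - loss(minus)) / (2.0 * ck)
        else:
            # Single-sided estimate: one perturbed evaluation plus base.
            scale = (loss(plus) - loss(base)) / ck
        grad = [scale / d for d in delta]
        # Constrained update: project the new iterate back onto the box.
        x = clip([b - ak * g for b, g in zip(base, grad)], lo, hi)
    return x


if __name__ == "__main__":
    # Toy quadratic with its minimum at (1.0, -0.5), inside the box [-2, 2]^2.
    quad = lambda v: (v[0] - 1.0) ** 2 + (v[1] + 0.5) ** 2
    print(hybrid_spsa(quad, [1.8, -1.8], [-2.0, -2.0], [2.0, 2.0]))
```

Shrinking the feasible box by the current perturbation size is one simple way to keep perturbed inputs feasible; other constraint-handling schemes are possible, and the thesis may use a different one.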

Results reported for normal versus ventromedial prefrontal cortex (vmPFC) impaired participants in the Iowa Gambling Task (IGT) are used, together with CSUD search, to calibrate a series of single-state Q-learning models with exponential learning rate decay. The series culminates in the burst learning model, in which the learning rate can be reset via an ‘emotion’ mediated signal. The key results obtained from the automatic calibration of the Q-learning models are: (1) high learning rate decay produces vmPFC impaired behaviour, and (2) for Q-learning models to match human IGT outcomes, exploration must be very high. The presence of high exploration is validated in corresponding human IGT outcomes by introducing an entropy-based exploration index (EI). Four different Q-learning architectures, including ε-Greedy and Boltzmann exploration, are considered, and it is found that no single exploration architecture can alone adequately explain human exploration.
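
A minimal sketch of the ingredients named above may help: a single-state (bandit-style) Q-learning update with an exponentially decaying learning rate, a reset of that rate, epsilon-greedy exploration, and an entropy-based summary of choices. Everything specific here is an assumption rather than the thesis's definition: the reset trigger (a large prediction error standing in for the 'emotion' mediated signal), the omission of a discount term, the normalised Shannon entropy used as an exploration index, and all function and parameter names.

```python
import math
import random


def exp_decay_rate(alpha0, decay, t_since_reset):
    """Learning rate alpha0 * exp(-decay * t), with t counted from the last reset."""
    return alpha0 * math.exp(-decay * t_since_reset)


def run_bandit(reward_fns, trials=100, alpha0=0.5, decay=0.05,
               epsilon=0.3, burst_threshold=None, seed=0):
    """Single-state Q-learning with epsilon-greedy choice (illustrative only).

    reward_fns: one callable per action, each taking an rng and returning a reward.
    burst_threshold: hypothetical trigger; if |prediction error| exceeds it,
    the learning-rate clock is reset, restoring the rate to alpha0.
    """
    rng = random.Random(seed)
    q = [0.0] * len(reward_fns)
    choices = []
    t_since_reset = 0
    for _ in range(trials):
        if rng.random() < epsilon:
            action = rng.randrange(len(q))                 # explore
        else:
            action = max(range(len(q)), key=q.__getitem__)  # exploit
        reward = reward_fns[action](rng)
        error = reward - q[action]
        if burst_threshold is not None and abs(error) > burst_threshold:
            t_since_reset = 0                               # burst: reset the rate
        alpha = exp_decay_rate(alpha0, decay, t_since_reset)
        q[action] += alpha * error
        t_since_reset += 1
        choices.append(action)
    return q, choices


def exploration_index(choices, n_actions):
    """Normalised Shannon entropy of choice frequencies, in [0, 1] (assumed form)."""
    total = len(choices)
    counts = [choices.count(a) for a in range(n_actions)]
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c)
    return entropy / math.log(n_actions)


if __name__ == "__main__":
    # Four hypothetical options with made-up Gaussian rewards (not IGT payoffs).
    arms = [lambda r, m=m: r.gauss(m, 1.0) for m in (0.0, 0.2, 0.5, 1.0)]
    q_values, picks = run_bandit(arms, burst_threshold=2.0)
    print(q_values, exploration_index(picks, len(arms)))
```

With this assumed index, a value near 1 means choices are spread almost evenly across the options, and a value near 0 means one option dominates.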

Finally, the performance of nonrational CSUD in tuning a (rational) artificial neural network (ANN) is assessed. For a complex network, the validation accuracy achieved by the nonrational search strategy exceeds that of random search tuners, but lags behind that of Gaussian-mixture Bayesian tuners.

Publication Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology > Computer Science
School of Science & Technology > School of Science & Technology Doctoral Theses
Doctoral Theses
Text - Accepted Version