City Research Online

Predictive Improvement through Latent Space Optimisation

McCaffrey, A., Alonso, E. ORCID: 0000-0002-3306-695X & Mondragón, E. ORCID: 0000-0003-4180-1261 (2025). Predictive Improvement through Latent Space Optimisation. Paper presented at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025), 19-23 May 2025, Detroit, USA.

Abstract

Efficient exploration remains a challenge in reinforcement learning (RL), especially in stochastic or complex environments. We introduce Predictive Improvement through Latent space OpTimisation (PILOT), an intrinsically motivated RL algorithm that rewards actions leading to improvements in the agent’s environmental dynamics model. PILOT optimises an intrinsic reward signal based on epistemic uncertainty reduction, thereby encouraging structured exploration. Our evaluations against benchmark intrinsic motivation algorithms in challenging environments show that PILOT achieves superior performance and exhibits robustness to stochastic distractions.

Publication Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology; School of Science & Technology > Computer Science
Text - Published Version
Available under License Creative Commons: Attribution International Public License 4.0.