Imitation and Reinforcement Learning with Heterogeneous Actions

CPM Report No.: 00-67
By: Bob Price and Craig Boutilier
Date: 2nd May 2000

A Paper at: The "Starting from Society" symposium at the AISB'2000 convention, Birmingham University, 16th-19th April 2000.

Also published as: Bob Price and Craig Boutilier (2000), "Imitation and Reinforcement Learning with Heterogeneous Actions", in the Proceedings of the AISB'00 Symposium on Starting from Society - the Application of Social Analogies to Computational Systems, Birmingham, UK: AISB, 85-92. (ISBN 1 902956 13 8)


Abstract

We study the problem of accelerating reinforcement learning through the observation and implicit imitation of expert agents (mentors) acting in the same domain. In this paper, we consider problems that arise when the learner and mentor have heterogeneous actions. We extend an earlier implicit imitation model with two components: feasibility testing (determining whether a specific mentor action can be duplicated) and repair (discovering a "plan" that simulates a mentor's trajectory). We demonstrate empirically that both components allow agents to learn much more readily than standard RL agents and than implicit imitation agents that lack these extended capabilities.


Accessible as: