Imitation and Reinforcement Learning with Heterogeneous Actions
CPM Report No.: 00-67
By: Bob Price and Craig Boutilier
Date: 2nd May 2000
A paper at: The "Starting from Society" symposium at the AISB'2000
convention, Birmingham University, 16th-19th April 2000.
Also published as: Bob Price and Craig Boutilier (2000), "Imitation
and Reinforcement Learning with Heterogeneous Actions", in the Proceedings of
the AISB'00 Symposium on Starting from Society - the Application of Social
Analogies to Computational Systems, Birmingham, UK: AISB, 85-92.
(ISBN 1 902956 13 8)
Abstract
We study the problem of accelerating reinforcement learning through the observation and
implicit imitation of expert agents (mentors) acting in the same domain. In this
paper, we consider problems that arise when the learner and mentor have
heterogeneous actions. We extend an earlier implicit imitation model to allow for feasibility
testing (determining whether a specific mentor action can be duplicated) and
repair (discovering a "plan" that simulates a mentor's trajectory) and
demonstrate empirically that both of these components allow learning
agents to learn much more readily than standard RL agents and implicit imitation agents
without these extended capabilities.
Accessible as: