Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs

We consider an autonomous agent facing a partially observable, stochastic, multiagent environment where the unknown policies of other agents are represented as finite state controllers (FSCs). We show how an agent can (i) learn the FSCs of the other agents, and (ii) exploit these models during interactions. To separate the issues of off-line versus on-line learning, we consider here an off-line two-phase approach. During the first phase, the agent observes the other player(s) interacting with the environment (the observations may be imperfect, and the learning agent does not take part in the interaction). The collected data is used to learn an ensemble of FSCs that explain the behavior of the other agent(s) using a Bayesian nonparametric (BNP) approach. We verify the quality of the learned models during the second phase by allowing the agent to compute its own optimal policy and interact with the observed agent. The optimal policy for the learning agent is obtained by solving a...
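
For readers unfamiliar with the representation, the following is a minimal sketch of a stochastic finite state controller: each node holds a distribution over actions, and each (node, observation) pair holds a distribution over successor nodes. The class name `FSC`, the helper `run`, and the toy two-node controller with its "listen"/"open" actions and "growl"/"quiet" observations are all hypothetical illustrations and are not taken from the paper; the sketch only shows how such a controller is executed, not how it is learned.

```python
import random
from dataclasses import dataclass


@dataclass
class FSC:
    """Stochastic finite state controller (illustrative, hypothetical names)."""
    action_probs: dict   # node -> {action: probability}
    transitions: dict    # (node, observation) -> {next_node: probability}
    node: int = 0        # current controller node

    def act(self, rng):
        # Sample an action from the current node's action distribution.
        actions, weights = zip(*self.action_probs[self.node].items())
        return rng.choices(actions, weights=weights)[0]

    def observe(self, observation, rng):
        # Sample the next node given the current node and the observation.
        nodes, weights = zip(*self.transitions[(self.node, observation)].items())
        self.node = rng.choices(nodes, weights=weights)[0]


def run(controller, observations, rng):
    """Alternate acting and observing; return the sequence of actions taken."""
    trace = []
    for obs in observations:
        trace.append(controller.act(rng))
        controller.observe(obs, rng)
    return trace


# Toy deterministic controller (an assumption for illustration only):
# node 0 listens, node 1 opens; a "quiet" observation in node 0 moves to node 1.
toy = FSC(
    action_probs={0: {"listen": 1.0}, 1: {"open": 1.0}},
    transitions={
        (0, "growl"): {0: 1.0},
        (0, "quiet"): {1: 1.0},
        (1, "growl"): {0: 1.0},
        (1, "quiet"): {0: 1.0},
    },
)
trace = run(toy, ["growl", "quiet", "quiet"], random.Random(0))
```

In the setting the abstract describes, the learning agent would see only (possibly noisy) traces of this kind and would have to infer the number of nodes and both sets of distributions from them.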

Alessandro Panella, Piotr J. Gmytrasiewicz

ATAL 2015 | Intelligent Agents

Added	16 Apr 2016
Updated	16 Apr 2016
Type	Journal
Year	2015
Where	ATAL
Authors	Alessandro Panella, Piotr J. Gmytrasiewicz
