QUICR-Learning for Multi-Agent Coordination



Coordinating multiple agents that need to perform a sequence of actions to maximize a system-level reward requires solving two distinct credit assignment problems. First, credit must be assigned for an action taken at time step t that results in a reward at time step t' > t. Second, credit must be assigned for the contribution of agent i to the overall system performance. The first credit assignment problem is typically addressed with temporal difference methods such as Q-learning. The second credit assignment problem is typically addressed by creating custom reward functions. To address both credit assignment problems simultaneously, we propose "Q Updates with Immediate Counterfactual Rewards-learning" (QUICR-learning), which is designed to improve both the convergence properties and performance of Q-learning in large multi-agent problems. QUICR-learning is based on previous work on single-time-step counterfactual rewards described by the collectives framework. Results on a tra...
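
The two credit assignment problems in the abstract can be illustrated with a short sketch. It is not the authors' formulation: it assumes a tabular Q-learning update for the temporal problem and, for the per-agent problem, a counterfactual reward defined as the system reward minus the system reward with agent i's action removed. The congestion-style `system_reward`, the use of `None` to model a removed agent, and all function names and parameter values are hypothetical choices made for illustration.

```python
from collections import defaultdict


def system_reward(joint_actions):
    """Hypothetical system-level reward: penalize crowding on each resource."""
    counts = defaultdict(int)
    for action in joint_actions:
        if action is not None:  # None marks an agent removed from the system
            counts[action] += 1
    return -sum(c * c for c in counts.values())


def counterfactual_reward(joint_actions, i):
    """System reward minus the reward the system would get without agent i."""
    without_i = list(joint_actions)
    without_i[i] = None  # assumption: the counterfactual removes agent i's action
    return system_reward(joint_actions) - system_reward(without_i)


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to the table q."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Three agents choose between resources 0 and 1; agents 0 and 1 collide on 0.
joint = [0, 0, 1]
q_agent0 = defaultdict(float)  # agent 0's private Q-table, initialized to zero
r0 = counterfactual_reward(joint, 0)
q_update(q_agent0, "s", joint[0], r0, "s_next", actions=[0, 1])
```

In this sketch each agent feeds its own counterfactual reward, rather than the shared system reward, into an otherwise ordinary Q-learning update, so an agent that adds to congestion receives a larger penalty than one that does not.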

Adrian K. Agogino, Kagan Tumer

Real-time Traffic

AAAI 2006 | Credit Assignment Problems | Distinct Credit Assignment | Intelligent Agents | Time Step


Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	AAAI
Authors	Adrian K. Agogino, Kagan Tumer
