Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Recent advances in technology allow multi-agent systems to be deployed in cooperation with or as a service for humans. Typically, those systems are designed assuming individually ...
Our goal is to provide learning mechanisms to game agents so they are capable of adapting to new behaviors based on the actions of other agents. We introduce a new on-line reinfor...
Traditional hop-by-hop dynamic routing makes inefficient use of network resources as it forwards packets along already congested shortest paths while uncongested longer paths may b...
Minsoo Lee, Xiaohui Ye, Dan Marconett, Samuel John...
Abstract— We consider the problem of apprenticeship learning when the expert’s demonstration covers only a small part of a large state space. Inverse Reinforcement Learning (IR...