Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that...
Peter Auer, Zakria Hussain, Samuel Kaski, Arto Kla...
Abstract. Approximate Policy Iteration (API) is a reinforcement learning paradigm that is able to solve high-dimensional, continuous control problems. We propose to exploit API for...
A robotic agent must coordinate its coupled concurrent behaviors to produce a coherent response to stimuli. Reinforcement learning has been used extensively in coordinating sensin...
Multi-robot learning faces all of the challenges of robot learning with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent ...