Sciweavers

» Reinforcement Using Supervised Learning for Policy Generaliz...
ICML
2002
IEEE
14 years 8 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan
ACL
2008
13 years 9 months ago
Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating...
Verena Rieser, Oliver Lemon

Publication
14 years 4 months ago
Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration
Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...
Christos Dimitrakakis, Michail G. Lagoudakis
IJAIT
2008
13 years 7 months ago
Learning to Behave in Space: a Qualitative Spatial Representation for Robot Navigation with Reinforcement Learning
...ion mechanism to create a representation of space consisting of the circular order of detected landmarks and the relative position of walls towards the agent's moving directio...
Lutz Frommberger
ICRA
2009
IEEE
14 years 2 months ago
Which landmark is useful? Learning selection policies for navigation in unknown environments
Abstract: In general, a mobile robot that operates in unknown environments has to maintain a map and has to determine its own location given the map. This introduces significant...
Hauke Strasdat, Cyrill Stachniss, Wolfram Burgard