Sciweavers

262 search results - page 34 / 53
» Bounded-Parameter Partially Observable Markov Decision Proce...
CONNECTION
2008
13 years 10 months ago
Spoken language interaction with model uncertainty: an adaptive human-robot interaction system
Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experienc...
Finale Doshi, Nicholas Roy
ICRA
2008
IEEE
14 years 4 months ago
Bayesian reinforcement learning in continuous POMDPs with application to robot navigation
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...
ATAL
2009
Springer
14 years 4 months ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
ICMLA
2009
13 years 7 months ago
Sensitivity Analysis of POMDP Value Functions
In sequential decision making under uncertainty, as in many other modeling endeavors, researchers observe a dynamical system and collect data measuring its behavior over time. The...
Stéphane Ross, Masoumeh T. Izadi, Mark Merc...
NN
2010
Springer
13 years 8 months ago
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...