Sciweavers

332 search results - page 62 / 67
» Ranking policies in discrete Markov decision processes
AAAI
2007
13 years 10 months ago
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...
Colin McMillen, Manuela M. Veloso
NIPS
2007
13 years 9 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
NIPS
2007
13 years 9 months ago
What makes some POMDP problems easy to approximate?
Point-based algorithms have been surprisingly successful in computing approximately optimal solutions for partially observable Markov decision processes (POMDPs) in high dimension...
David Hsu, Wee Sun Lee, Nan Rong
AUTOMATICA
2007
13 years 8 months ago
Motion planning in uncertain environments with vision-like sensors
In this work we present a methodology for intelligent path planning in an uncertain environment using vision-like sensors, i.e., sensors that allow the sensing of the environment ...
Suman Chakravorty, John L. Junkins
JMLR
2006
13 years 8 months ago
Causal Graph Based Decomposition of Factored MDPs
We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...
Anders Jonsson, Andrew G. Barto