Search Sciweavers | Sciweavers

332 search results - page 62 / 67

» Ranking policies in discrete Markov decision processes

AAAI 2007 (Intelligent Agents)

Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games

Download www.cs.cmu.edu

In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...

Colin McMillen, Manuela M. Veloso

NIPS 2007 (Information Technology)

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

Download books.nips.cc

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...

Ambuj Tewari, Peter L. Bartlett

NIPS 2007 (Information Technology)

What makes some POMDP problems easy to approximate?

Download books.nips.cc

Point-based algorithms have been surprisingly successful in computing approximately optimal solutions for partially observable Markov decision processes (POMDPs) in high dimension...

David Hsu, Wee Sun Lee, Nan Rong

AUTOMATICA 2007

Motion planning in uncertain environments with vision-like sensors

Download dnc.tamu.edu

In this work we present a methodology for intelligent path planning in an uncertain environment using vision-like sensors, i.e., sensors that allow the sensing of the environment ...

Suman Chakravorty, John L. Junkins

JMLR 2006

Causal Graph Based Decomposition of Factored MDPs

Download www-anw.cs.umass.edu

We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...

Anders Jonsson, Andrew G. Barto
