Search Sciweavers | Sciweavers

» Ranking policies in discrete Markov decision processes

ALDT
2009
Springer

142 views, Algorithms

Finding Best k Policies

14 years 2 months ago

Download www.cs.uky.edu

Abstract. An optimal probabilistic-planning algorithm solves a problem, usually modeled by a Markov decision process, by finding its optimal policy. In this paper, we study the k ...

Peng Dai, Judy Goldsmith

ICML
2009
IEEE

148 views, Machine Learning

Predictive representations for policy gradient in POMDPs

14 years 8 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa

ATAL
2007
Springer

122 views, Intelligent Agents

Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies

14 years 2 months ago

Download teamcore.usc.edu

Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are a popular approach for modeling multi-agent systems acting in uncertain domains. Given the signi...

Pradeep Varakantham, Janusz Marecki, Yuichi Yabu, ...

ICML
2004
IEEE

120 views, Machine Learning

Utile distinction hidden Markov models

14 years 8 months ago

Download www.idsia.ch

This paper addresses the problem of constructing good action selection policies for agents acting in partially observable environments, a class of problems generally known as Part...

Daan Wierstra, Marco Wiering

ICML
2006
IEEE

142 views, Machine Learning

An intrinsic reward mechanism for efficient exploration

14 years 8 months ago

Download www-anw.cs.umass.edu

How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...

Özgür Simsek, Andrew G. Barto
