Search Sciweavers | Sciweavers

135 search results - page 11 / 27

» Bounded Parameter Markov Decision Processes

click to vote

Publication

233views

Sparse reward processes

12 years 6 months ago

Download arxiv.org

We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is relation among those tasks, then the information gained duri...

Christos Dimitrakakis

posted by olethros

Read More »

click to vote

NIPS
2008

116views Information Technology» more NIPS 2008»

Particle Filter-based Policy Gradient in POMDPs

13 years 9 months ago

Download eprints.pascal-network.org

Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...

Pierre-Arnaud Coquelin, Romain Deguest, Rém...

claim paper

Read More »

click to vote

ML
2002
ACM

121views Machine Learning» more ML 2002»

Near-Optimal Reinforcement Learning in Polynomial Time

13 years 7 months ago

Download www.cis.upenn.edu

We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...

Michael J. Kearns, Satinder P. Singh

claim paper

Read More »

click to vote

ISAAC
2010
Springer

243views Algorithms» more ISAAC 2010»

Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles

13 years 5 months ago

Download www.daimi.au.dk

Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...

Thomas Dueholm Hansen, Uri Zwick

claim paper

Read More »

click to vote

AAAI
2006

157views Intelligent Agents» more AAAI 2006»

Compact, Convex Upper Bound Iteration for Approximate POMDP Planning

13 years 9 months ago

Download www.aaai.org

Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision making problems under uncertainty. Unfortunately, even approx...

Tao Wang, Pascal Poupart, Michael H. Bowling, Dale...

claim paper

Read More »

« Prev « First page 11 / 27 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers