Search Sciweavers | Sciweavers

» Experts in a Markov Decision Process

NIPS 2001, Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

NIPS 2001, Information Technology

Multiagent Planning with Factored MDPs

Download books.nips.cc

We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication be...

Carlos Guestrin, Daphne Koller, Ronald Parr

IJCAI 2003, Artificial Intelligence

Approximating Optimal Policies for Agents with Limited Execution Resources

Download ai.stanford.edu

An agent with limited consumable execution resources needs policies that attempt to achieve good performance while respecting these limitations. Otherwise, an agent (such as a pla...

Dmitri A. Dolgov, Edmund H. Durfee

SODA 2004 (ACM), Algorithms

Quantitative stochastic parity games

Download www.dcs.warwick.ac.uk

We study perfect-information stochastic parity games. These are two-player nonterminating games which are played on a graph with turn-based probabilistic transitions. A play resul...

Krishnendu Chatterjee, Marcin Jurdzinski, Thomas A...

NIPS 2003, Information Technology

Distributed Optimization in Adaptive Networks

Download books.nips.cc

We develop a protocol for optimizing dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of ...

Ciamac Cyrus Moallemi, Benjamin Van Roy
