When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...
For a Markov Decision Process with a finite state space (size S) and finite action sets (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
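For context, the following is a minimal sketch of the standard tabular Q-learning backup that Delayed Q-Learning builds on. It is not the paper's exact algorithm (Delayed Q-Learning additionally batches m samples per state-action pair and updates optimistically); the toy chain MDP, sizes, and learning rate here are illustrative assumptions.

```python
# Standard tabular Q-learning on a toy deterministic chain MDP.
# NOT the paper's Delayed Q-Learning: that variant defers each update
# until m samples of (s, a) have been collected and keeps Q optimistic.
S, A = 5, 2              # finite state / action space sizes (illustrative)
gamma, alpha = 0.9, 0.5  # discount factor and learning rate
Q = [[0.0] * A for _ in range(S)]

def q_update(s, a, r, s_next):
    """One Q-learning backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

# Toy dynamics: action 1 advances along the chain, action 0 stays put;
# reward 1 whenever the next state is the last state.
for _ in range(100):            # repeated sweeps over all (s, a) pairs
    for s in range(S):
        for a in range(A):
            s_next = min(s + a, S - 1)
            r = 1.0 if s_next == S - 1 else 0.0
            q_update(s, a, r, s_next)
```

After enough sweeps the table approaches the optimal values for this chain, e.g. advancing is preferred to staying at every non-terminal state.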
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Partially Observable Markov Decision Process models (POMDPs) have been applied to low-level robot control. We show how to use POMDPs differently, namely for sensor planning in the ...
Cross-layer optimization aims at improving the performance of network users operating in a time-varying, error-prone wireless environment. However, current solutions often rely on...