Search Sciweavers | Sciweavers

16 search results - page 2 / 4

» On Average Versus Discounted Reward Temporal-Difference Lear...

COLT
2007
Springer

143 views, Machine Learning

Bounded Parameter Markov Decision Processes with Average Reward Criterion

16 years 1 months ago

Download ttic.uchicago.edu

Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, t...

Ambuj Tewari, Peter L. Bartlett

ALT
2007
Springer

119 views, Machine Learning

Pseudometrics for State Aggregation in Average Reward Markov Decision Processes

16 years 4 months ago

Download personal.unileoben.ac.at

We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are we...

Ronald Ortner

ICRA
2002
IEEE

133 views, Robotics

The Necessity of Average Rewards in Cooperative Multirobot Learning

16 years 53 min ago

Download www.ri.cmu.edu

Learning can be an effective way for robot systems to deal with dynamic environments and changing task conditions. However, popular single-robot learning algorithms based on discou...

Poj Tangamchit, John M. Dolan, Pradeep K. Khosla

AI
1998
Springer

177 views, Artificial Intelligence

Model-Based Average Reward Reinforcement Learning

15 years 6 months ago

Download web.engr.oregonstate.edu

Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...

Prasad Tadepalli, DoKyeong Ok

ICML
2002
IEEE

146 views, Machine Learning

Hierarchically Optimal Average Reward Reinforcement Learning

16 years 7 months ago

Download www.cs.ualberta.ca

Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...

Mohammad Ghavamzadeh, Sridhar Mahadevan
