Search Sciweavers | Sciweavers

87 search results - page 10 / 18

ICML
2008
IEEE

165 views, Machine Learning

A worst-case comparison between temporal difference and residual gradient with linear function approximation

14 years 8 months ago

Download www.research.rutgers.edu

Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...

Lihong Li

ATAL
2009
Springer

198 views, Intelligent Agents

SarsaLandmark: an algorithm for learning in POMDPs with landmarks

14 years 2 months ago

Download www.aamas-conference.org

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...

Michael R. James, Satinder P. Singh

SIGMETRICS
2005
ACM

118 views, Hardware

Nearly insensitive bounds on SMART scheduling

14 years 1 months ago

Download www.cs.cmu.edu

We define the class of SMART scheduling policies. These are policies that bias towards jobs with small remaining service times, jobs with small original sizes, or both, with the ...

Adam Wierman, Mor Harchol-Balter, Takayuki Osogami

ICRA
2010
IEEE

163 views, Robotics

Exploiting domain knowledge in planning for uncertain robot systems modeled as POMDPs

13 years 6 months ago

Download robotics.ai.uiuc.edu

Abstract— We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as parti...

Salvatore Candido, James C. Davidson, Seth Hutchin...

RSS
2007

176 views, Robotics

Active Policy Learning for Robot Planning and Exploration under Uncertainty

13 years 9 months ago

Download www.roboticsproceedings.org

Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...

Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
