Search Sciweavers | Sciweavers

ICRA
2009
IEEE

Robotics

Least absolute policy iteration for robust value function approximation

15 years 8 months ago

Abstract: Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

BMCV
2000
Springer

Computer Vision

Unsupervised Learning of Biologically Plausible Object Recognition Strategies

15 years 6 months ago

Download www.cs.colostate.edu

Recent psychological and neurological evidence suggests that biological object recognition is a process of matching sensed images to stored iconic memories. This paper presents a p...

Bruce A. Draper, Kyungim Baek

ICALT
2007
IEEE

Machine Learning

Evaluating the automatic and manual creation process of adaptive lessons

15 years 4 months ago

Download prolearn.dcs.warwick.ac.uk

Using adaptive, personalized courses is rewarding, as it can create a better learning experience, tailored for a specific learner’s needs. The process of creating these courses,...

Maurice Hendrix, Alexandra I. Cristea, Mike Joy

AAMAS
2002
Springer

Intelligent Agents

Relational Reinforcement Learning for Agents in Worlds with Objects

15 years 2 months ago

Download www-ai.ijs.si

In reinforcement learning, an agent tries to learn a policy, i.e., how to select an action in a given state of the environment, so that it maximizes the total amount of reward it ...

Saso Dzeroski

NIPS
2001

Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 3 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
