Search Sciweavers | Sciweavers

MIRRORBOT
2005
Springer

Robotics

Learning to Interpret Pointing Gestures: Experiments with Four-Legged Autonomous Robots

16 years 10 days ago

This paper explores the hypothesis that pointing gesture recognition can be learned using a reward-based system. An experiment with two four-legged robots is presented. One of the...

Verena Vanessa Hafner, Frédéric Kapl...

AUSAI
1999
Springer

Artificial Intelligence

Q-Learning in Continuous State and Action Spaces

15 years 11 months ago

Download users.cecs.anu.edu.au

Abstract. Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Q-learning is commonly applied to problems with d...

Chris Gaskett, David Wettergreen, Alexander Zelins...

NECO
2010

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

15 years 5 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

WPES
2003
ACM

Security Privacy

Policy migration for sensitive credentials in trust negotiation

16 years 3 days ago

Download www4.ncsu.edu

Trust negotiation is an approach to establishing trust between strangers through the bilateral, iterative disclosure of digital credentials. Under automated trust negotiation, acc...

Ting Yu, Marianne Winslett

AAAI
2007

Intelligent Agents

Authorial Idioms for Target Distributions in TTD-MDPs

15 years 9 months ago

Download www.cc.gatech.edu

In designing Markov Decision Processes (MDP), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...

David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...
