Search Sciweavers | Sciweavers

ICONIP
2007

147 views · Information Technology

Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents

13 years 11 months ago

The aim of the Cyber Rodent project [1] is to elucidate the origin of our reward and affective systems by building artificial agents that share the natural biological constraints...

Eiji Uchibe, Kenji Doya


IJCAI
2007

143 views · Artificial Intelligence

Direct Code Access in Self-Organizing Neural Networks for Reinforcement Learning

13 years 11 months ago

Download www.aaai.org

TD-FALCON is a self-organizing neural network that incorporates Temporal Difference (TD) methods for reinforcement learning. Despite the advantages of fast and stable learning, TD...

Ah-Hwee Tan


EACL
2006
ACL Anthology

143 views · Natural Language Processing

Using Reinforcement Learning to Build a Better Model of Dialogue State

13 years 11 months ago

Download acl.ldc.upenn.edu

Given the growing complexity of tasks that spoken dialogue systems are trying to handle, Reinforcement Learning (RL) has been increasingly used as a way of automatically learning ...

Joel R. Tetreault, Diane J. Litman


NIPS
2004

93 views · Information Technology

Intrinsically Motivated Reinforcement Learning

13 years 11 months ago

Download books.nips.cc

Psychologists call behavior intrinsically motivated when it is engaged in for its own sake rather than as a step toward solving a specific problem of clear practical value. But wh...

Satinder P. Singh, Andrew G. Barto, Nuttapong Chen...


NIPS
2000

127 views · Information Technology

Using Free Energies to Represent Q-values in a Multiagent Reinforcement Learning Task

13 years 11 months ago

Download members.chello.at

The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...

Brian Sallans, Geoffrey E. Hinton
