Sciweavers

» Policy Gradients for Cryptanalysis
NIPS
2003
13 years 9 months ago
Extending Q-Learning to General Adaptive Multi-Agent Systems
Recent multi-agent extensions of Q-Learning require knowledge of other agents’ payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This pap...
Gerald Tesauro
ICONIP
2007
13 years 9 months ago
Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents
The aim of the Cyber Rodent project [1] is to elucidate the origin of our reward and affective systems by building artificial agents that share the natural biological constraints...
Eiji Uchibe, Kenji Doya
SIGDIAL
2010
13 years 5 months ago
Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy
This paper presents a spoken dialogue framework that helps users in making decisions. Users often do not have a definite goal or criteria for selecting from a list of alternatives...
Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chi...
ACL
2009
13 years 5 months ago
Reinforcement Learning for Mapping Instructions to Actions
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
EDBT
2008
ACM
14 years 7 months ago
BI batch manager: a system for managing batch workloads on enterprise data-warehouses
Modern enterprise data warehouses have complex workloads that are notoriously difficult to manage. An important problem in workload management is to run these complex workloads `o...
Abhay Mehta, Chetan Gupta, Umeshwar Dayal