Sciweavers

185 search results - page 37 / 37
» Simulation-Based Optimization Algorithms for Finite-Horizon ...
Sort
View
ICML
1996
IEEE
14 years 8 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore
NIPS
1998
13 years 9 months ago
Risk Sensitive Reinforcement Learning
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...
Ralph Neuneier, Oliver Mihatsch
HICSS
2003
IEEE
207views Biometrics» more  HICSS 2003»
14 years 27 days ago
Formalizing Multi-Agent POMDP's in the context of network routing
This paper uses partially observable Markov decision processes (POMDP’s) as a basic framework for MultiAgent planning. We distinguish three perspectives: first one is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasie...
CSL
2010
Springer
13 years 7 months ago
Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems
This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on...
Blaise Thomson, Steve Young
WWW
2005
ACM
14 years 8 months ago
Executing incoherency bounded continuous queries at web data aggregators
Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some ...
Rajeev Gupta, Ashish Puri, Krithi Ramamritham