Sciweavers

Search results for: Hidden Markov processes
ATAL
2010
Springer
13 years 7 months ago
Combining manual feedback with subsequent MDP reward signals for reinforcement learning
As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents de...
W. Bradley Knox, Peter Stone
CORR
2008
Springer
13 years 6 months ago
Strategy Improvement for Concurrent Safety Games
We consider concurrent games played on graphs. At every round of the game, each player simultaneously and independently selects a move; the moves jointly determine the transition ...
Krishnendu Chatterjee, Luca de Alfaro, Thomas A. H...
CSL
2010
Springer
13 years 6 months ago
Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems
This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on...
Blaise Thomson, Steve Young
CORR
2008
Springer
13 years 6 months ago
Decomposition Principles and Online Learning in Cross-Layer Optimization for Delay-Sensitive Applications
In this paper, we propose a general cross-layer optimization framework in which we explicitly consider both the heterogeneous and dynamically changing characteristics of delay-sens...
Fangwen Fu, Mihaela van der Schaar
AI
2006
Springer
13 years 6 months ago
Backward-chaining evolutionary algorithms
Starting from some simple observations on a popular selection method in Evolutionary Algorithms (EAs), tournament selection, we highlight a previously unknown source of inefficien...
Riccardo Poli, William B. Langdon