Sciweavers

267 search results - page 36 / 54
» Qualitative Analysis of Partially-Observable Markov Decision...
AIPS
2008
13 years 10 months ago
HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot
Flexible general purpose robots need to tailor their visual processing to their task, on the fly. We propose a new approach to this within a planning framework, where the goal is ...
Mohan Sridharan, Jeremy L. Wyatt, Richard Dearden
CSL
2012
Springer
12 years 3 months ago
Reinforcement learning for parameter estimation in statistical spoken dialogue systems
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
Filip Jurčíček, Blaise Thomson, Steve Young
VTC
2008
IEEE
14 years 2 months ago
Opportunistic Spectrum Access for Energy-Constrained Cognitive Radios
This paper considers a scenario in which a secondary user makes opportunistic use of a channel allocated to some primary network. The primary network operates in a time-slotted ma...
Anh Tuan Hoang, Ying-Chang Liang, David Tung Chong...
ICML
2005
IEEE
14 years 8 months ago
A theoretical analysis of Model-Based Interval Estimation
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Inter...
Alexander L. Strehl, Michael L. Littman
ICML
2010
IEEE
13 years 8 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu