Search Sciweavers | Sciweavers


» Qualitative Analysis of Partially-Observable Markov Decision...


AIPS
2008

155 views | Artificial Intelligence

HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot

15 years 8 months ago

Download www.cs.bham.ac.uk

Flexible general purpose robots need to tailor their visual processing to their task, on the fly. We propose a new approach to this within a planning framework, where the goal is ...

Mohan Sridharan, Jeremy L. Wyatt, Richard Dearden


CSL
2012
Springer

311 views | Automated Reasoning

Reinforcement learning for parameter estimation in statistical spoken dialogue systems

14 years 1 month ago

Download mi.eng.cam.ac.uk

Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...

Filip Jurčíček, Blaise Thomson, Steve Young


VTC
2008
IEEE

185 views | Communications

Opportunistic Spectrum Access for Energy-Constrained Cognitive Radios

16 years 10 days ago

Download www1.i2r.a-star.edu.sg

This paper considers a scenario in which a secondary user makes opportunistic use of a channel allocated to some primary network. The primary network operates in a time-slotted ma...

Anh Tuan Hoang, Ying-Chang Liang, David Tung Chong...


ICML
2005
IEEE

133 views | Machine Learning

A theoretical analysis of Model-Based Interval Estimation

16 years 6 months ago

Download paul.rutgers.edu

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Inter...

Alexander L. Strehl, Michael L. Littman


ICML
2010
IEEE

219 views | Machine Learning

Convergence of Least Squares Temporal Difference Methods Under General Conditions

15 years 7 months ago

Download www.cs.helsinki.fi

We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...

Huizhen Yu
