Search Sciweavers | Sciweavers

802 search results - page 158 / 161

» Experts in a Markov Decision Process

NIPS
1998

Risk Sensitive Reinforcement Learning

Download www.cs.cmu.edu

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...

Ralph Neuneier, Oliver Mihatsch

NIPS
1996

Multidimensional Triangulation and Interpolation for Reinforcement Learning

Download www.cs.cmu.edu

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies

ATAL
2010
Springer

Combining manual feedback with subsequent MDP reward signals for reinforcement learning

Download www.cs.utexas.edu

As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents de...

W. Bradley Knox, Peter Stone

CORR
2008
Springer

Strategy Improvement for Concurrent Safety Games

Download www.soe.ucsc.edu

We consider concurrent games played on graphs. At every round of the game, each player simultaneously and independently selects a move; the moves jointly determine the transition ...

Krishnendu Chatterjee, Luca de Alfaro, Thomas A. H...

CSL
2010
Springer

Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems

Download mi.eng.cam.ac.uk

This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on...

Blaise Thomson, Steve Young
