Search Sciweavers | Sciweavers

260 search results - page 28 / 52



SIGECOM
2009
ACM


Policy teaching through reward function learning

15 years 11 months ago

Download www.eecs.harvard.edu

Policy teaching considers a Markov Decision Process setting in which an interested party aims to influence an agent’s decisions by providing limited incentives. In this paper, ...

Haoqi Zhang, David C. Parkes, Yiling Chen


IJCAI
2003


Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings

15 years 5 months ago

Download dli.iiit.ac.in

The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision proces...

Ranjit Nair, Milind Tambe, Makoto Yokoo, David V. ...


JAIR
2006


Anytime Point-Based Approximations for Large POMDPs

15 years 4 months ago

Download www.jair.org

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact s...

Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun


CSL
2012
Springer


Reinforcement learning for parameter estimation in statistical spoken dialogue systems

14 years 3 days ago

Download mi.eng.cam.ac.uk

Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...

Filip Jurčíček, Blaise Thomson, Steve Young


ATAL
2009
Springer


SarsaLandmark: an algorithm for learning in POMDPs with landmarks

15 years 11 months ago

Download www.aamas-conference.org

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...

Michael R. James, Satinder P. Singh
