Sciweavers

102 search results - page 10 / 21
» MDPs with Non-Deterministic Policies
ACL
2000
Spoken Dialogue Management Using Probabilistic Reasoning
Spoken dialogue managers have benefited from using stochastic planners such as Markov Decision Processes (MDPs). However, so far, MDPs do not handle well noisy and ambiguous speec...
Nicholas Roy, Joelle Pineau, Sebastian Thrun
AAAI
1997
Structured Solution Methods for Non-Markovian Decision Processes
Markov Decision Processes (MDPs), currently a popular method for modeling and solving decision theoretic planning problems, are limited by the Markovian assumption: rewards and dy...
Fahiem Bacchus, Craig Boutilier, Adam J. Grove
ICML
2008
Learning all optimal policies with multiple criteria
We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...
Leon Barrett, Srini Narayanan
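The abstract's idea of learning policies for all linear preferences can be illustrated on a one-step toy problem: store vector-valued rewards, one entry per criterion, and scalarize only at decision time, so any preference weighting can be served after learning. This is a minimal sketch with invented actions and reward vectors, not the paper's algorithm.

```python
import numpy as np

# Hypothetical one-step decision problem with two reward criteria.
# Each action yields a fixed reward vector (criterion_1, criterion_2).
action_rewards = {
    "cautious": np.array([1.0, 0.2]),   # strong on criterion 1
    "risky":    np.array([0.1, 1.5]),   # strong on criterion 2
}

def best_action(w):
    """Scalarize at decision time: pick the action maximizing w . r(a)."""
    return max(action_rewards, key=lambda a: float(w @ action_rewards[a]))
```

Because the vector-valued rewards are kept intact, different weight vectors recover different optimal policies without relearning: weight on criterion 1 selects "cautious", weight on criterion 2 selects "risky".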
AAAI
2006
Factored MDP Elicitation and Plan Display
The software suite we will demonstrate at AAAI '06 was designed around planning with factored Markov decision processes (MDPs). It is a user-friendly suite that facilitates d...
Krol Kevin Mathias, Casey Lengacher, Derek William...
ML
2002
Technical Update: Least-Squares Temporal Difference Learning
TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
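The contrast the abstract draws can be sketched on a toy chain: TD(λ) nudges the value function after every transition with a step size, while least-squares TD accumulates sufficient statistics and solves one linear system. The 5-state deterministic chain below is an invented example (not from the paper); the true value of state s is gamma**(4 - s).

```python
import numpy as np

n_states = 5
gamma = 0.9

def episode():
    """Yield (state, reward, next_state) along the chain; reward 1 on the
    final transition into the terminal state (index n_states)."""
    for s in range(n_states):
        r = 1.0 if s == n_states - 1 else 0.0
        yield s, r, s + 1

def td_lambda(num_episodes=2000, alpha=0.1, lam=0.8):
    """Incremental TD(lambda): update V after every observed transition."""
    V = np.zeros(n_states + 1)          # terminal state pinned at value 0
    for _ in range(num_episodes):
        e = np.zeros(n_states + 1)      # accumulating eligibility traces
        for s, r, s_next in episode():
            delta = r + gamma * V[s_next] - V[s]
            e[s] += 1.0
            V += alpha * delta * e
            e *= gamma * lam
    return V[:n_states]

def feat(s):
    """One-hot features for nonterminal states; terminal maps to zeros."""
    v = np.zeros(n_states)
    if s < n_states:
        v[s] = 1.0
    return v

def lstd():
    """Batch LSTD: accumulate A and b over transitions, then solve once --
    no step size and no per-step updates."""
    A = np.zeros((n_states, n_states))
    b = np.zeros(n_states)
    for s, r, s_next in episode():
        phi, phi_next = feat(s), feat(s_next)
        A += np.outer(phi, phi - gamma * phi_next)
        b += phi * r
    return np.linalg.solve(A, b)
```

On this deterministic chain both estimators reach the same values; the design difference is that LSTD extracts the answer from one pass of statistics, whereas TD(λ) needs many episodes and a tuned step size.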