Search Sciweavers | Sciweavers

87 search results - page 11 / 18

» A policy iteration algorithm for Markov decision processes s...

FOCS
2007
IEEE

157 views, Theoretical Computer Science

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

14 years 1 month ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

AIPS
2007

174 views, Artificial Intelligence

Learning to Plan Using Harmonic Analysis of Diffusion Models

13 years 9 months ago

Download www.cs.umass.edu

This paper summarizes research on a new emerging framework for learning to plan using the Markov decision process model (MDP). In this paradigm, two approaches to learning to plan...

Sridhar Mahadevan, Sarah Osentoski, Jeffrey Johns,...

ATAL
2009
Springer

198 views, Intelligent Agents

SarsaLandmark: an algorithm for learning in POMDPs with landmarks

14 years 2 months ago

Download www.aamas-conference.org

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...

Michael R. James, Satinder P. Singh

ICML
1999
IEEE

168 views, Machine Learning

Least-Squares Temporal Difference Learning

14 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

GECCO
2005
Springer

152 views, Optimization

GAMM: genetic algorithms with meta-models for vision

14 years 1 month ago

Download www.cs.bham.ac.uk

Recent adaptive image interpretation systems can reach optimal performance for a given domain via machine learning, without human intervention. The policies are learned over an ex...

Greg Lee, Vadim Bulitko
