Sciweavers

260 search results - page 49 / 52
» Quasi-Deterministic Partially Observable Markov Decision Pro...
CPAIOR
2008
Springer
13 years 9 months ago
Amsaa: A Multistep Anticipatory Algorithm for Online Stochastic Combinatorial Optimization
The one-step anticipatory algorithm (1s-AA) is an online algorithm making decisions under uncertainty by ignoring future non-anticipativity constraints. It makes near-optimal decis...
Luc Mercier, Pascal Van Hentenryck
ICML
1999
IEEE
14 years 8 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
DEXA
2003
Springer
14 years 24 days ago
Context-Aware Data Mining Framework for Wireless Medical Application
Data mining, which aims at extracting interesting information from large collections of data, has been widely used as an effective decision-making tool. Mining the datas...
Pravin Vajirkar, Sachin Singh, Yugyung Lee
ICML
2007
IEEE
14 years 8 months ago
Multi-task reinforcement learning: a hierarchical Bayesian approach
We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...
ICML
2004
IEEE
14 years 8 months ago
Apprenticeship learning via inverse reinforcement learning
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Pieter Abbeel, Andrew Y. Ng