Sciweavers

SIAMCO
2010
13 years 6 months ago
Optimal Control under Stochastic Target Constraints
We study a class of Markovian optimal stochastic control problems in which the controlled process Z is constrained to satisfy an a.s. constraint $Z(T) \in G \subset \mathbb{R}^{d+1}$ $\mathbb{P}$-a.s. at some fi...
Bruno Bouchard, Romuald Elie, Cyril Imbert
PKDD
2010
Springer
13 years 9 months ago
Smarter Sampling in Model-Based Bayesian Reinforcement Learning
Bayesian reinforcement learning (RL) is aimed at making more efficient use of data samples, but typically uses significantly more computation. For discrete Markov Decis...
Pablo Samuel Castro, Doina Precup
MOR
2010
13 years 10 months ago
On the One-Dimensional Optimal Switching Problem
We explicitly solve the optimal switching problem for one-dimensional diffusions by directly employing the dynamic programming principle and the excessive characterization of the ...
Erhan Bayraktar, Masahiko Egami
JAIR
2010
13 years 10 months ago
Kalman Temporal Differences
This paper deals with value (and Q-) function approximation in deterministic Markovian decision processes (MDPs). A general statistical framework based on the Kalman filtering pa...
Matthieu Geist, Olivier Pietquin
INFORMS
2010
13 years 10 months ago
Approximate Dynamic Programming for Ambulance Redeployment
We present an approximate dynamic programming approach for making ambulance redeployment decisions in an emergency medical service system. The primary decision is where we should ...
Matthew S. Maxwell, Mateo Restrepo, Shane G. Hende...
QUESTA
1998
13 years 11 months ago
Structural results for the control of queueing systems using event-based dynamic programming
In this paper we study monotonicity results for optimal policies of various queueing and resource sharing models. The standard approach is to propagate, for each specific model, ...
Ger Koole
ML
2002
ACM
13 years 11 months ago
Technical Update: Least-Squares Temporal Difference Learning
TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...
Justin A. Boyan
AI
1998
Springer
13 years 11 months ago
Model-Based Average Reward Reinforcement Learning
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...
Prasad Tadepalli, DoKyeong Ok
TSMC
2008
13 years 11 months ago
Ensemble Algorithms in Reinforcement Learning
This paper describes several ensemble methods that combine multiple different reinforcement learning (RL) algorithms in a single agent. The aim is to enhance learning speed and fin...
Marco A. Wiering, Hado van Hasselt
SIAMCO
2008
13 years 11 months ago
A Direct Solution Method for Stochastic Impulse Control Problems of One-dimensional Diffusions
We consider stochastic impulse control problems where the process is driven by one-dimensional diffusions. Impulse control problems are widely applied to financial engineering and...
Masahiko Egami