Sciweavers

JMLR
2006
13 years 9 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
CORR
2012
Springer
12 years 4 months ago
Fractional Moments on Bandit Problems
Reinforcement learning addresses the dilemma between exploration to find profitable actions and exploitation to act according to the best observations already made. Bandit proble...
Ananda Narayanan B., Balaraman Ravindran
DSP
2006
13 years 9 months ago
Adaptive multi-modality sensor scheduling for detection and tracking of smart targets
This paper considers the problem of sensor scheduling for the purposes of detection and tracking of "smart" targets. Smart targets are targets that can detect when they ...
Christopher M. Kreucher, Doron Blatt, Alfred O. He...
AUSAI
2005
Springer
14 years 2 months ago
Adaptive Utility-Based Scheduling in Resource-Constrained Systems
This paper addresses the problem of scheduling jobs in soft real-time systems, where the utility of completing each job decreases over time. We present a utility-based framework fo...
David Vengerov
NIPS
1993
13 years 10 months ago
Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming
Dynamic programming provides a methodology to develop planners and controllers for nonlinear systems. However, general dynamic programming is computationally intractable. We have ...
Christopher G. Atkeson