Sciweavers

185 search results - page 30 / 37
» Simulation-Based Optimization Algorithms for Finite-Horizon ...
AAAI
2010
15 years 5 months ago
Symbolic Dynamic Programming for First-order POMDPs
Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (appro...
Scott Sanner, Kristian Kersting
NIPS
2007
15 years 5 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
NIPS
2001
15 years 5 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
VTC
2008
IEEE
15 years 10 months ago
Network Controlled Joint Radio Resource Management for Heterogeneous Networks
In this paper, we propose a way of achieving optimality in radio resource management (RRM) for heterogeneous networks. We consider a micro or femto cell with two co-loc...
Marceau Coupechoux, Jean Marc Kelif, Philippe Godl...
ATAL
2007
Springer
15 years 9 months ago
Combinatorial resource scheduling for multiagent MDPs
Optimal resource scheduling in multiagent systems is a computationally challenging task, particularly when the values of resources are not additive. We consider the combinatorial ...
Dmitri A. Dolgov, Michael R. James, Michael E. Sam...