Sciweavers

1912 search results - page 155 / 383
» Optimizing interconnection policies
NIPS
2001
15 years 5 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
JMLR
2006
15 years 3 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos
TON
2008
15 years 3 months ago
Integration of explicit effective-bandwidth-based QoS routing with best-effort routing
This paper presents a methodology for protecting low-priority best-effort (BE) traffic in a network domain that provides both virtual-circuit routing with bandwidth reservation for...
Stephen L. Spitler, Daniel C. Lee
INFOCOM
2010
IEEE
15 years 2 months ago
Change Management in Enterprise IT Systems: Process Modeling and Capacity-optimal Scheduling
We provide a formal model for the Change Management process for Enterprise IT systems, and develop change scheduling algorithms that seek to attain the “change capacit...
Praveen Kumar Muthuswamy, Koushik Kar, Sambit Sahu...
COLT
2010
Springer
15 years 1 months ago
Best Arm Identification in Multi-Armed Bandits
We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...
Jean-Yves Audibert, Sébastien Bubeck, Ré...