Sciweavers

» Regret Bounds for Prediction Problems
INFOCOM
2010
IEEE
13 years 8 months ago
Opportunistic Spectrum Access with Multiple Users: Learning under Competition
The problem of cooperative allocation among multiple secondary users to maximize cognitive system throughput is considered. The channel availability statistics are initi...
Animashree Anandkumar, Nithin Michael, Ao Tang
CORR
2010
Springer
13 years 10 months ago
Adaptive Bound Optimization for Online Convex Optimization
We introduce a new online convex optimization algorithm that adaptively chooses its regularization function based on the loss functions observed so far. This is in contrast to pre...
H. Brendan McMahan, Matthew J. Streeter
NIPS
2007
13 years 11 months ago
The Price of Bandit Information for Online Optimization
In the online linear optimization problem, a learner must choose, in each round, a decision from a set D ⊂ R^n in order to minimize an (unknown and changing) linear cost function...
Varsha Dani, Thomas P. Hayes, Sham Kakade
CORR
2010
Springer
13 years 8 months ago
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
IOR
2011
13 years 4 months ago
On the Minimax Complexity of Pricing in a Changing Environment
We consider a pricing problem in an environment where the customers’ willingness-to-pay (WtP) distribution may change at some point over the selling horizon. Customers arrive se...
Omar Besbes, Assaf J. Zeevi