Regret Minimization Under Partial Monitoring


We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of Ω(n^{-1/3}) on the convergence rate of the regret, and exhibit a specific strategy that attains this rate on any game for which a Hannan consistent player exists.

The first two authors acknowledge support by the PASCAL Network of Excellence under EC grant no. 506778. The work of the second author was supported by the Spanish Ministry of Science and Technology and FEDER, grant BMF2003-03324. Part of this work was done while the third co-author was visiting Pompeu Fabra University.
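To make the notion of per-round regret concrete, here is a minimal sketch of how it could be computed after the fact for a finite game. The loss matrix, the function name `per_round_regret`, and the action encoding are illustrative assumptions, not taken from the paper. Under partial monitoring the player does not observe the opponent's actions, so this is an analysis tool rather than something the player could evaluate during play.

```python
from typing import Sequence


def per_round_regret(
    loss: Sequence[Sequence[float]],
    player_actions: Sequence[int],
    opponent_actions: Sequence[int],
) -> float:
    """Average regret against the best fixed action in hindsight.

    loss[i][j] is the (assumed) loss when the player picks action i
    and the opponent picks action j.
    """
    n = len(player_actions)
    if n == 0 or n != len(opponent_actions):
        raise ValueError("action sequences must be non-empty and of equal length")

    # Cumulative loss actually incurred by the player.
    incurred = sum(loss[i][j] for i, j in zip(player_actions, opponent_actions))

    # Cumulative loss of the best single action played in every round.
    best_fixed = min(
        sum(loss[i][j] for j in opponent_actions) for i in range(len(loss))
    )

    return (incurred - best_fixed) / n


# Hypothetical 2x2 game: the player loses 1 when the actions differ.
example_loss = [[0.0, 1.0], [1.0, 0.0]]
example_regret = per_round_regret(example_loss, [0, 0, 0, 0], [0, 1, 1, 1])
```

A Hannan consistent strategy is one for which this quantity has a non-positive limit superior, with probability one, as the number of rounds grows, whatever the opponent does.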

Nicolò Cesa-Bianchi, Gábor Lugosi, Gilles Stoltz


Game | General Lower Bound | Hannan Consistent Player | MOR 2006 |


Post Info

Added	14 Dec 2010
Updated	14 Dec 2010
Type	Journal
Year	2006
Where	MOR
Authors	Nicolò Cesa-Bianchi, Gábor Lugosi, Gilles Stoltz
