Sciweavers

CORR
2008
Springer
14 years 17 days ago
Multi-Armed Bandits in Metric Spaces
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of n trials so as to maximize the total payoff of the chosen strategies. While ...
Robert Kleinberg, Aleksandrs Slivkins, Eli Upfal
CEC
2005
IEEE
14 years 6 months ago
XCS with computed prediction for the learning of Boolean functions
Computed prediction represents a major shift in learning classifier system research. XCS with computed prediction, based on linear approximators, has been applied so far to functi...
Pier Luca Lanzi, Daniele Loiacono, Stewart W. Wils...
LICS
2007
IEEE
14 years 6 months ago
Limits of Multi-Discounted Markov Decision Processes
Markov decision processes (MDPs) are controllable discrete event systems with stochastic transitions. The payoff received by the controller can be evaluated in different ways, dep...
Hugo Gimbert, Wieslaw Zielonka