Sciweavers

ORL
2007
13 years 10 months ago
Linear dependence of stationary distributions in ergodic Markov decision processes
In ergodic MDPs we consider stationary distributions of policies that coincide in all but n states, in which one of two possible actions is chosen. We give conditions and formulas...
Ronald Ortner
CORR
2010
Springer
13 years 11 months ago
Mixing Time and Stationary Expected Social Welfare of Logit Dynamics
We study logit dynamics [3] for strategic games. At every stage of the game a player is selected uniformly at random and she is assumed to play according to a noisy best-response ...
Vincenzo Auletta, Diodato Ferraioli, Francesco Pas...
ICML
2010
IEEE
14 years 7 days ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
WSC
1998
14 years 15 days ago
Stopping Criterion for a Simulation-Based Optimization Method
We consider a new simulation-based optimization method called the Nested Partitions (NP) method. This method generates a Markov chain and solving the optimization problem is equiv...
Sigurdur Ólafsson, Leyuan Shi
DAGSTUHL
2006
14 years 17 days ago
How fast does the stationary distribution of the Markov chain modelling EAs concentrate on the homogeneous populations for small
One of the main difficulties faced when analyzing Markov chains modelling evolutionary algorithms is that their cardinality grows quite fast. A reasonable way to deal with this iss...
Boris Mitavskiy, Jonathan E. Rowe