This paper investigates how the Univariate Marginal Distribution Algorithm (UMDA) behaves in non-stationary environments when engaging in sampling and selection strategies designe...
Response Surface Methodology (RSM) is a metamodel-based optimization method. Its strategy is to explore small subregions of the parameter space in succession instead of attempting ...
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
Decentralized reinforcement learning (DRL) has been applied to a number of distributed applications. However, one of the main challenges faced by DRL is its convergence. Previous ...
Chongjie Zhang, Victor R. Lesser, Sherief Abdallah
We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODEs). We show that the optimal...