We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions. The task for an age...
The mean first passage time (MFPT) is calculated for a Brownian particle in a bounded two-dimensional domain that contains N small nonoverlapping absorbing windows on its boundary....
S. Pillay, Michael J. Ward, A. Peirce, Theodore Ko...
Reinforcement learning promises a generic method for adapting agents to arbitrary tasks in arbitrary stochastic environments, but applying it to new real-world problems remains di...
There is much experimental evidence that network traffic processes exhibit ubiquitous properties of self-similarity and long-range dependence, i.e., of correlations over a wide ran...
This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two ke...
Joel Veness, Kee Siong Ng, Marcus Hutter, William ...