One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
Exotic semirings such as the “(max, +) semiring” (R ∪ {−∞}, max, +), or the “tropical semiring” (N ∪ {+∞}, min, +), have been invented and reinvented many times s...
We are interested in contributing to solving effectively a particular type of real-time stochastic resource allocation problem. Firstly, one distinction is that certain tasks may c...
Decentralized partially observable Markov decision process (DEC-POMDP) is an approach to model multi-robot decision making problems under uncertainty. Since it is NEXP-complete the...
A quantum device simulating human decision making process is introduced. It consists of quantum recurrent nets generating stochastic processes which represent the motor dynamics, ...