Partially Observable Markov Decision Processes (POMDPs) provide a general framework for AI planning, but they lack the structure for representing real world planning problems in a...
The goal of Reinforcement learning (RL) is to maximize reward (minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...
A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
Abstract. Taking neuromodulation as a mechanism underlying emotions, this paper investigates how such a mechanism can bias an artificial neural network towards exploration of new ...
Probabilistic timed automata, a variant of timed automata extended with discrete probability distributions, is a specification formalism suitable for describing both nondeterminis...
Marta Z. Kwiatkowska, Gethin Norman, David Parker,...