Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...
This paper analyzes the complexity of on-line reinforcement learning algorithms, namely asynchronous realtime versions of Q-learning and value-iteration, applied to the problem of...
We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...
Many real-world problems are multi-objective optimization problems and evolutionary algorithms are quite successful on such problems. Since the task is to compute or approximate t...
We extend the alternating-time temporal logics ATL and ATL* with strategy contexts and memory constraints: the first extension makes strategy quantifiers not “forget” the s...
Thomas Brihaye, Arnaud Da Costa Lopes, Franç...