Abstract. In order to establish autonomous behavior for technical systems, the well known trade-off between reactive control and deliberative planning has to be considered. Within ...
We present an efficient "sparse sampling" technique for approximating Bayes optimal decision making in reinforcement learning, addressing the well known exploration vers...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...
The learning classifier system XCS is an iterative rulelearning system that evolves rule structures based on gradient-based prediction and rule quality estimates. Besides classifi...
Martin V. Butz, Pier Luca Lanzi, Stewart W. Wilson
Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea ...
Abstract— The optimal model parameters of a kernel machine are typically given by the solution of a convex optimisation problem with a single global optimum. Obtaining the best p...