It is known that the complexity of reinforcement learning algorithms, such as Q-learning, may be exponential in the number of the environment’s states. It was shown, however, th...
Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...
Over the years, various research projects have attempted to develop a chess program that learns to play well given little prior knowledge beyond the rules of the game. Ea...
This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system as applied in neural-network-type learning and function approximation mechani...
Martin V. Butz, David E. Goldberg, Pier Luca Lanzi
Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...