Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
Most complete answer set solvers are based on DPLL. One of the constraint propagation methods is the so-called lookahead, which has been somewhat controversial, due to it...
Recently, Kutin and Niyogi investigated several notions of algorithmic stability (a property of a learning map conceptually similar to continuity), showing that training-st...
This paper introduces a new approach to classification which combines pairwise decomposition techniques with ideas and tools from fuzzy preference modeling. More specifically, our...
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...