In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Abstract McCarthy’s Situation Calculus is arguably the oldest special-purpose knowledge representation formalism, designed to axiomatize knowledge of actions and their effects. ...
Current evaluation functions for heuristic planning are expensive to compute. In numerous planning problems these functions provide good guidance to the solution, so they are wort...
The results of the latest International Probabilistic Planning Competition (IPPC-2008) indicate that the presence of dead ends, states with no trajectory to the goal, makes MDPs h...