In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
We study functional and multivalued dependencies over SQL tables with NOT NULL constraints. Under a no-information interpretation of null values, we develop tools for reasoning. We...
We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...
We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal state costs or Q-facto...
Embodying robot morphologies evolved in simulation can present serious problems for an engineer when translating simplified simulated mechanisms into working devices, often drawing...