Abstract. One reason for wanting to compute an (approximate) Nash equilibrium of a game is to predict how players will play. However, if the game has multiple equilibria that are f...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Service coordination in domains involving temporal constraints and duration uncertainty has previously been solved with a greedy algorithm that attempts to satisfy service requests...
Morris, Muscettola and Vidal (MMV) presented an algorithm for checking the dynamic controllability (DC) of temporal networks in which certain temporal durations are beyond the con...
Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...