We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
It is a standard result in the theory of quantum error-correcting codes that no code of length n can fix more than n/4 arbitrary errors, regardless of the dimension of the coding ...
Aspnes et al. [2] introduced an innovative game for modeling the containment of the spread of viruses and worms (security breaches) in a network. In this model, nodes choose to i...
V. S. Anil Kumar, Rajmohan Rajaraman, Zhifeng Sun,...
Algorithms for determining quality/cost/price tradeoffs in saturated markets are considered. A product is modeled by d real-valued qualities whose sum determines the unit cost ...
Joachim Gudmundsson, Pat Morin, Michiel H. M. Smid
We develop multiattribute auctions that accommodate generalized additive independent (GAI) preferences. We propose an iterative auction mechanism that maintains prices on potentia...