We consider approximate policy evaluation for finite state and action Markov decision processes (MDPs) in the off-policy learning context with the simulation-based least square...
In this paper, the sum capacity of the Gaussian Multiple Input Multiple Output (MIMO) Cognitive Radio Channel (MCC) is expressed as a convex problem with a finite number of...
VLIW and EDGE (Explicit Data Graph Execution) architectures rely on compilers to form high-quality hyperblocks for good performance. These compilers typically perform hyperblock f...
Bertrand A. Maher, Aaron Smith, Doug Burger, Kathr...
In this paper, we consider the optimal rate and power allocation that maximizes a general utility function of average user rates in a fading multiple-access or broadcast channel. B...
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on ...