Abstract. Formal semantics for XQuery with side-effects have been proposed in [13, 16]. We propose a different semantics which is better suited for database compilation. We substan...
TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...
A fundamental challenge of managing mutable data replication in a Peer-to-Peer (P2P) system is how to efficiently maintain consistency under various sharing patterns with heter...
We study the problem of pricing uplink power in wide-band cognitive radio networks under the objective of revenue maximization for the service provider while ensuring incen...
Ashraf Al Daoud, Tansu Alpcan, Sachin Kumar Agarwa...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...