Abstract—We propose a routing metric for enabling highthroughput reliable multicast in multi-rate wireless mesh networks. This new multicast routing metric, called expected multi...
Xin Zhao, Jun Guo, Chun Tung Chou, Archan Misra, S...
Abstract. Many reinforcement learning domains are highly relational. While traditional temporal-difference methods can be applied to these domains, they are limited in their capaci...
Trevor Walker, Lisa Torrey, Jude W. Shavlik, Richa...
Abstract— We propose a planning algorithm that allows usersupplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as parti...
Salvatore Candido, James C. Davidson, Seth Hutchin...
Abstract— Research on numerical solution methods for partially observable Markov decision processes (POMDPs) has primarily focused on discrete-state models, and these algorithms ...
Sets of features in Markov decision processes can play a critical role ximately representing value and in abstracting the state space. Selection of features is crucial to the succe...