The search space of Bayesian Network structures is usually defined as Acyclic Directed Graphs (DAGs) and the search is done by local transformations of DAGs. But the space of Baye...
We empirically evaluate the performance of various reinforcement learning methods in applications to sequential targeted marketing. In particular, we propose and evaluate a progre...
Naoki Abe, Edwin P. D. Pednault, Haixun Wang, Bian...
Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...
Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success o...
Reinforcement learning is based on exploration of the environment and receiving reward that indicates which actions taken by the agent are good and which ones are bad. In many app...