We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic ...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...
In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...
In this paper, we propose a novel formulation of the network clique detection problem by introducing a general network data representation framework. We show connections between o...
Xiaoye Jiang, Yuan Yao, Han Liu, Leonidas J. Guiba...
Abstract. In this paper we unify divergence minimization and statistical inference by means of convex duality. In the process of doing so, we prove that the dual of approximate max...