We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
—This paper addresses the problem of robust template tracking in image sequences. Our work falls within the discriminative framework in which the observations at each frame yield...
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
This paper uses partially observable Markov decision processes (POMDP’s) as a basic framework for MultiAgent planning. We distinguish three perspectives: first one is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasie...
We describe a formal framework for diagnosis and repair problems that shares elements of the well known partially observable MDP and cost-sensitive classification models. Our cost...
Michael L. Littman, Nishkam Ravi, Eitan Fenson, Ri...