Partially Observable Markov Decision Processes (POMDP) provide a standard framework for sequential decision making in stochastic environments. In this setting, an agent takes actio...
Learning the reward function of an agent by observing its behavior is termed inverse reinforcement learning and has applications in learning from demonstration or apprenticeship l...
Abstract. One of the objectives of intelligent data engineering and automated learning is to develop algorithms that learn the environment, generate rules, and take possible course...
Learning general truths from the observation of simple domains and, further, learning how to use this knowledge are essential capabilities for any intelligent agent to understand ...
Paulo Santos, Derek R. Magee, Anthony G. Cohn, Dav...
Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor fee...
Michael L. Littman, Anthony R. Cassandra, Leslie P...