We calculate the Shannon entropy rate of a binary Hidden Markov Process (HMP), of given transition rate and noise (emission), as a series expansion in . The first two orders are ca...
— We consider decision-making problems in Markov decision processes where both the rewards and the transition probabilities vary in an arbitrary (e.g., nonstationary) fashion. We...
Explaining policies of Markov Decision Processes (MDPs) is complicated due to their probabilistic and sequential nature. We present a technique to explain policies for factored MD...
This paper investigates how to automatically create a dialogue control component of a listening agent to reduce the current high cost of manually creating such components. We coll...
Toyomi Meguro, Ryuichiro Higashinaka, Yasuhiro Min...
A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...