In this paper we develop an adaptive learning algorithm which is approximately optimal for an opportunistic spectrum access (OSA) problem with polynomial complexity. In this OSA...
POMDPs are the models of choice for reinforcement learning (RL) tasks where the environment cannot be observed directly. In many applications we need to learn the POMDP structure ...
Recommender systems are an important component of many websites. Two of the most popular approaches are based on matrix factorization (MF) and Markov chains (MC). MF methods learn...
Steffen Rendle, Christoph Freudenthaler, Lars Schm...
In the classic educational context, observing and identifying the learner's emotional response allows the teacher to adapt the lesson, with the aim of improving the quality of th...
Reasoning about agents that we observe in the world is challenging. Our available information is often limited to observations of the agent’s external behavior in the past and p...
H. Van Dyke Parunak, Sven Brueckner, Robert S. Mat...