Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision-making problems under uncertainty. Unfortunately, even approx...
Tao Wang, Pascal Poupart, Michael H. Bowling, Dale...
This paper compares the performance of two provably successful evolutionary optimization tools in the optimization of a Fuzzy-Rule-Base (FRB) for the three well-known fuzzy modeli...
There has been a lot of recent work on Bayesian methods for reinforcement learning exhibiting near-optimal online performance. The main obstacle facing such methods is that in most...
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Game theory is emerging as a popular tool for distributed control of multiagent systems. In order to take advantage of these game-theoretic tools, the interactions of the autonomous...