A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
We present a technique for computing approximately optimal solutions to stochastic resource allocation problems modeled as Markov decision processes (MDPs). We exploit two key pro...
Nicolas Meuleau, Milos Hauskrecht, Kee-Eung Kim, L...
Decentralized Markov Decision Processes (DEC-MDPs) are a popular model of agent-coordination problems in domains with uncertainty and time constraints but very difficult to solve...
This paper provides a technique, based on partially observable Markov decision processes (POMDPs), for building automatic recovery controllers to guide distributed system recovery...
Kaustubh R. Joshi, William H. Sanders, Matti A. Hi...
— We consider opportunistic spectrum access (OSA) which allows secondary users to identify and exploit instantaneous spectrum opportunities resulting from the bursty traffic of ...