The algorithmic framework developed for improving heuristic solutions of the new version of deterministic TSP [Choi et al., 2002] is extended to the stochastic case. To verify the...
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...
—Cooperative communication has attracted dramatic attention in the last few years due to its advantage in mitigating channel fading. Despite much effort that has been made in the...
It is well known that stochastic control systems can be viewed as Markov decision processes (MDPs) with continuous state spaces. In this paper, we propose to apply the policy iter...