We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Many problems of multiagent planning under uncertainty require distributed reasoning with continuous resources and resource limits. Decentralized Markov Decision Problems (Dec-MDP...
In this paper, we introduce a new technique for modeling and solving the dynamic power management (DPM) problem for systems with complex behavioral characteristics such as concurr...
We address the problem of optimally controlling stochastic environments that are partially observable. The standard method for tackling such problems is to define and solve a Part...
Meta-level control manages the allocation of limited resources to deliberative actions. This paper discusses efforts in adding meta-level control capabilities to a Markov Decision...