Meta-level control manages the allocation of limited resources to deliberative actions. This paper discusses efforts to add meta-level control capabilities to a Markov Decision Process (MDP)-based scheduling agent. The agent's reasoning process involves continuous partial unrolling of the MDP state space and periodic reprioritization of the states to be expanded. The meta-level controller makes situation-specific decisions on when the agent should stop unrolling in order to derive a partial policy while bounding the costs of state reprioritization. The described approach combines performance profiling with multi-level strategies in its decision making. We present results showing the performance advantage of dynamic meta-level control for this complex agent.

Categories and Subject Descriptors: I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search

General Terms: Algorithms, Performance

Keywords: bounded rationality, Markov decision process, meta-level c...
George Alexander, Anita Raja, David J. Musliner