This paper describes a multiagent variant of Dyna-Q called M-Dyna-Q. Dyna-Q is an integrated single-agent framework for planning, reacting, and learning. Like DynaQ, M-Dyna-Q employs two key ideas: learning results can serve as a valuable input for both planning and reacting, and results of planning and reacting can serve as a valuable input to learning. M-Dyna-Q extends Dyna-Q in that planning, reacting, and learning are jointly realized by multiple agents.