Policy Gradient Planning for Environmental Decision Making with Existing Simulators



In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action spaces, spatial correlation between actions, uncertainty, and complex utility models. We present an approach for modeling these planning problems as factored Markov decision processes. The reward model can contain local and global components as well as spatial constraints between locations. The transition dynamics can be provided by existing simulators developed by domain experts. We propose a landscape policy defined as the equilibrium distribution of a Markov chain built from many locally-parameterized policies. This policy is optimized using a policy gradient algorithm. Experiments using a forestry simulator demonstrate the algorithm’s ability to devise policies for sustainable harvest planning of a forest.
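To make the idea of a landscape policy more concrete, here is a minimal Python sketch of drawing one joint action from a Markov chain built out of local policies. It is an illustration under stated assumptions, not the authors' implementation: the grid layout, the binary harvest/no-harvest action, the logistic form of the local policy, the single shared parameter vector `theta`, and all function names are hypothetical. Each sweep resamples every cell's action from its local policy given its neighbours' current actions, so after enough sweeps the joint action is approximately a draw from the chain's equilibrium distribution. The policy gradient step and the simulator are left out.

```python
import math
import random


def local_policy_prob(theta, own_feature, neighbour_actions):
    """Probability of action 1 (e.g. harvest) at one cell.

    Assumed form: logistic in a bias, one local feature, and the
    fraction of neighbouring cells currently taking action 1.
    """
    if neighbour_actions:
        frac = sum(neighbour_actions) / len(neighbour_actions)
    else:
        frac = 0.0
    z = theta[0] + theta[1] * own_feature + theta[2] * frac
    return 1.0 / (1.0 + math.exp(-z))


def neighbours(i, j, rows, cols):
    """4-connected neighbours of cell (i, j) that lie inside the grid."""
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in candidates if 0 <= a < rows and 0 <= b < cols]


def sample_landscape_action(theta, features, sweeps=50, rng=None):
    """Run the chain for `sweeps` full passes and return the joint action.

    `features` is a rows x cols grid of per-cell numbers (hypothetical,
    e.g. normalised stand age). The returned grid holds 0/1 actions.
    """
    rng = rng or random.Random(0)
    rows, cols = len(features), len(features[0])
    # Arbitrary initial joint action; the chain is run long enough to forget it.
    actions = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                nb = [actions[a][b] for a, b in neighbours(i, j, rows, cols)]
                p = local_policy_prob(theta, features[i][j], nb)
                actions[i][j] = 1 if rng.random() < p else 0
    return actions


if __name__ == "__main__":
    grid = [[0.1, 0.5, 0.9], [0.2, 0.6, 0.8], [0.3, 0.4, 0.7]]
    # Negative weight on the neighbour fraction discourages adjacent harvests.
    for row in sample_landscape_action((-1.0, 2.0, -1.5), grid):
        print(row)
```

In this sketch the negative third weight plays the role of a soft spatial constraint between locations. A fuller version would give each location its own parameters and adjust them from simulator rewards.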

Mark Crowley, David Poole


AAAI 2011 | Equilibrium Distribution | Gradient Algorithm | Intelligent Agents | Spatial Constraints



Post Info

Added	12 Dec 2011
Updated	12 Dec 2011
Type	Journal
Year	2011
Where	AAAI
Authors	Mark Crowley, David Poole
