Non-parametric policy gradients: a unified treatment of propositional and relational domains

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains only. Without extensive feature engineering, it is difficult, if not impossible, to apply them within structured domains in which, for example, there is a varying number of objects and relations among them. In this paper, we describe a non-parametric policy gradient approach, called NPPG, that overcomes this limitation. The key idea is to apply Friedman's gradient boosting: policies are represented as a weighted sum of regression models grown in a stage-wise optimization. Employing off-the-shelf regression learners, NPPG can deal with propositional, continuous, and relational domains in a unified way. Our experimental results show that it can even improve on established results.
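The key idea, functional-gradient boosting of a Gibbs policy over a learned potential, can be illustrated with a short sketch. The environment, feature map, learning rate, and gradient estimator below are hypothetical stand-ins (a toy chain MDP, hand-coded features, and a REINFORCE-style Monte-Carlo estimate), not the paper's implementation.

```python
# A minimal, illustrative sketch in the spirit of NPPG (functional-gradient
# boosting of a Gibbs policy), assuming a hypothetical toy chain MDP with
# hand-coded propositional features and a REINFORCE-style Monte-Carlo
# estimate of the functional gradient. This is not the authors' code.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
N_STATES, ACTIONS = 10, [0, 1]          # toy chain: action 0 = left, 1 = right

def features(s, a):
    """Propositional feature vector for a state-action pair (assumed)."""
    return np.array([s, a, s * a, 1.0])

def potential(ensemble, s, a):
    """Psi(s, a): weighted sum of the boosted regression models (0 if empty)."""
    x = features(s, a).reshape(1, -1)
    return sum(w * m.predict(x)[0] for w, m in ensemble)

def policy(ensemble, s):
    """Gibbs (softmax) policy over the potential function."""
    z = np.array([potential(ensemble, s, a) for a in ACTIONS])
    e = np.exp(z - z.max())
    return e / e.sum()

def rollout(ensemble, horizon=20):
    """Sample one episode; reward 1 for stepping right at the last state."""
    s, traj, ret = 0, [], 0.0
    for _ in range(horizon):
        p = policy(ensemble, s)
        a = int(rng.choice(ACTIONS, p=p))
        ret += 1.0 if (s == N_STATES - 1 and a == 1) else 0.0
        traj.append((s, a, p[a]))
        s = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    return traj, ret

ensemble, step = [], 0.5
for _ in range(30):                      # stage-wise boosting iterations
    X, grad = [], []
    for _ in range(10):                  # Monte-Carlo sample of the functional gradient
        traj, ret = rollout(ensemble)
        for s, a, pa in traj:
            # point-wise gradient of the expected return w.r.t. Psi(s, a)
            # for a Gibbs policy: return * (1 - pi(a|s))  (REINFORCE-style)
            X.append(features(s, a))
            grad.append(ret * (1.0 - pa))
    # fit an off-the-shelf regression learner to the gradient and add it
    tree = DecisionTreeRegressor(max_depth=3).fit(np.array(X), np.array(grad))
    ensemble.append((step, tree))
```

If the regression learner were swapped for a relational one (for example, a relational regression-tree learner), the same boosting loop would in principle apply unchanged to relational state descriptions, which is the point of the unified treatment.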
Type: Conference
Year: 2008
Where: ICML
Authors: Kristian Kersting, Kurt Driessens