Non-parametric policy gradients: a unified treatment of propositional and relational domains

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains only. Without extensive feature engineering, it is difficult, if not impossible, to apply them within structured domains in which, e.g., there is a varying number of objects and relations among them. In this paper, we describe a non-parametric policy gradient approach, called NPPG, that overcomes this limitation. The key idea is to apply Friedman's gradient boosting: policies are represented as a weighted sum of regression models grown in a stage-wise optimization. Employing off-the-shelf regression learners, NPPG can deal with propositional, continuous, and relational domains in a unified way. Our experimental results show that it can even improve on established results.
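The key idea stated in the abstract, a policy represented as a stage-wise weighted sum of regression models and improved by boosting-style functional gradient steps, can be illustrated with a minimal sketch. The sketch below assumes a softmax policy over a boosted potential and uses a scikit-learn regression tree as the off-the-shelf learner; the class and method names, learning rate, and squared-loss tree fit are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class BoostedPolicy:
    """Sketch of a policy pi(a|s) proportional to exp(Psi(s, a)), where Psi
    is a weighted sum of regression models grown stage-wise (boosting)."""

    def __init__(self, learning_rate=0.5, max_depth=3):
        self.lr = learning_rate
        self.max_depth = max_depth
        self.stages = []  # one fitted regressor per boosting stage

    def potential(self, features):
        # Psi evaluated on a batch of state-action feature rows.
        if not self.stages:
            return np.zeros(len(features))
        return sum(self.lr * h.predict(features) for h in self.stages)

    def action_probs(self, candidate_features):
        # candidate_features: one feature row per available action in a state.
        psi = self.potential(candidate_features)
        expo = np.exp(psi - psi.max())
        return expo / expo.sum()

    def boost(self, features, gradient_targets):
        # One functional-gradient step: fit an off-the-shelf regressor to
        # point-wise estimates of the gradient of expected return with
        # respect to Psi, then add it to the ensemble.
        h = DecisionTreeRegressor(max_depth=self.max_depth)
        h.fit(features, gradient_targets)
        self.stages.append(h)


# Toy usage with random data (stand-ins for real gradient estimates).
rng = np.random.default_rng(0)
policy = BoostedPolicy()
X = rng.normal(size=(20, 4))       # state-action feature rows
g = rng.normal(size=20)            # stand-in functional-gradient targets
policy.boost(X, g)
print(policy.action_probs(X[:2]))  # probabilities for two candidate actions
```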
Kristian Kersting, Kurt Driessens
Added: 17 Nov 2009
Updated: 17 Nov 2009
Type: Conference
Year: 2008
Where: ICML
Authors: Kristian Kersting, Kurt Driessens