Sciweavers

ICML
2005
IEEE
15 years 7 days ago
Dynamic preferences in multi-criteria reinforcement learning
The current framework of reinforcement learning is based on maximizing the expected returns based on scalar rewards. But in many real world situations, tradeoffs must be made amon...
Sriraam Natarajan, Prasad Tadepalli