Reinforcement learning problems are commonly tackled with temporal difference methods, which use dynamic programming and statistical sampling to estimate the long-term value of ta...
We present a novel approach to natural language generation (NLG) that applies hierarchical reinforcement learning to text generation in the wayfinding domain. Our approach aims to...
Coordinating agents in a complex environment is a hard problem, but it can become even harder when certain characteristics of the tasks, like the required number of agents, are un...
— Existing searching schemes in unstructured P2Ps can be categorized as either blind or informed. The quality of query results in blind schemes is low. Informed schemes use simpl...
— We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...