We consider an opportunistic spectrum access (OSA) problem where the time-varying condition of each channel (e.g., as a result of random fading or certain primary users' activ...
This paper attempts to define an emotional model for virtual agents that behave autonomously in social worlds. We adopt shallow modeling based on the decomposition of the...
We present new algorithms for inverse optimal control (or inverse reinforcement learning, IRL) within the framework of linearly-solvable MDPs (LMDPs). Unlike most prior IRL algorit...
Humans continuously assess one another's situational context, modify their own affective state, and then respond based on these outcomes through empathetic expression. Virtua...
Scott W. McQuiggan, Jennifer L. Robison, Robert Ph...
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...