Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
— We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...
Recent psychological and neurological evidence suggests that biological object recognition is a process of matching sensed images to stored iconic memories. This paper presents a p...
Abstract. Reasoning plays a central role in intelligent systems that operate in complex situations that involve time constraints. In this paper, we present the Adaptive Logic Inter...
Nima Asgharbeygi, Negin Nejati, Pat Langley, Sachi...
Abstract-- In order to increase the transportation capability of elevator group systems in high-rise buildings without adding elevator installation space, double-deck elevator syst...
Jin Zhou, Lu Yu, Shingo Mabu, Kotaro Hirasawa, Jin...