Given the pattern-based multi-predictors of the stock price, we study a method of dynamic asset allocation to maximize the trading performance. To optimize the proportion of asset ...
Jangmin O, Jae Won Lee, Jongwoo Lee, Byoung-Tak Zh...
Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...
Reinforcement learning is based on exploration of the environment and receiving reward that indicates which actions taken by the agent are good and which ones are bad. In many app...
We introduce an approach to autonomously creating state space abstractions for an online reinforcement learning agent using a relational representation. Our approach uses a tree-b...
— Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle realworld sequential decision processes but require a known model to be solv...