We introduce the ALeRT (Action-dependent Learning Rates with Trends) algorithm that makes two modifications to the learning rate and one change to the exploration rate of traditio...
Maria Cutumisu, Duane Szafron, Michael H. Bowling,...
Exploration/Exploitation equilibrium is one of the most challenging issues in reinforcement learning area as well as learning classifier systems such as XCS. In this paper1 , an i...