A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
Searching for a walker who wants to be found, and who moves toward the helicopter whenever the helicopter is within earshot, is an example of a two-sided search problem which is ...
This paper explores an approach to global, stochastic, simulation optimization which combines stochastic approximation (SA) with simulated annealing (SAN). SA directs a search of ...
The use of the pseudo-random proportional rule to balance exploitation and exploration in the search process was shown in Ant Colony System (ACS) algorith...
In bandit problems, a decision-maker must choose among a set of alternatives, each of which has a fixed but unknown rate of reward, to maximize their total number of rewards ov...
Michael D. Lee, Shunan Zhang, Miles Munro, Mark St...
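
As a rough illustration of the bandit setting described in the entry above, the sketch below simulates alternatives with fixed reward rates that are hidden from the decision-maker, who tries to maximize total reward over a fixed number of trials. The epsilon-greedy rule used here is a common textbook heuristic chosen only for illustration; it is not taken from the listed paper, and the function name `run_epsilon_greedy` and its parameters are hypothetical.

```python
import random


def run_epsilon_greedy(reward_rates, n_trials, epsilon=0.1, seed=0):
    """Simulate a Bernoulli bandit played with an epsilon-greedy rule.

    reward_rates: true success probability of each alternative (unknown
                  to the decision-maker, used only to generate rewards).
    Returns (total_reward, pulls), where pulls[i] counts choices of arm i.
    """
    rng = random.Random(seed)
    n_arms = len(reward_rates)
    pulls = [0] * n_arms   # how often each alternative was chosen
    wins = [0] * n_arms    # how many rewards each alternative produced
    total_reward = 0

    for _ in range(n_trials):
        # Explore: pick at random with probability epsilon, or while
        # some alternative has never been tried.
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the alternative with the best observed rate.
            arm = max(range(n_arms), key=lambda a: wins[a] / pulls[a])

        # The environment pays 1 with the arm's fixed, hidden probability.
        reward = 1 if rng.random() < reward_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total_reward += reward

    return total_reward, pulls


if __name__ == "__main__":
    total, pulls = run_epsilon_greedy([0.2, 0.8], n_trials=1000)
    print("total reward:", total, "pulls per arm:", pulls)
```

A larger `epsilon` spends more trials exploring; a smaller one commits earlier to whichever alternative currently looks best, which is the exploration and exploitation trade-off these entries refer to.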