In the classic multi-armed bandit problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...
The stochastic knapsack has been used as a model in wide-ranging applications, from dynamic resource allocation to admission control in telecommunication. In recent years, a variat...
Link spam is used to increase the ranking of certain target web pages by misleading the connectivity-based ranking algorithms in search engines. In this paper we study how web pag...
The rectilinear Steiner tree (RST) problem is of essential importance to automatic interconnect optimization for VLSI design. In this paper, we present a class of pro...
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...