Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-gammon illustrated how the combination of game tree search...
Continuous state spaces and stochastic, switching dynamics characterize a number of rich, realworld domains, such as robot navigation across varying terrain. We describe a reinfor...
Emma Brunskill, Bethany R. Leffler, Lihong Li, Mic...
Developing dispatching rules for manufacturing systems is a tedious process, which is time- and cost-consuming. Since there is no good general rule for different scenarios and ob...
We present a Simplex-type algorithm, that is, an algorithm that moves from one extreme point of the infinite-dimensional feasible region to another not necessarily adjacent extrem...
We consider a high density of sensors randomly placed in a geographical area for event monitoring. The monitoring regions of the sensors may have significant overlap, and a subset...
Shibo He, Jiming Chen, David K. Y. Yau, Huanyu Sha...