Relational world models that can be learned from experience in stochastic domains have received significant attention recently. However, efficient planning using these models rema...
Abstract. We present a first constant performance guarantee for preemptive stochastic scheduling to minimize the sum of weighted completion times. For scheduling jobs with release ...
Working in collaboration with Spain-based retailer Zara, we address the problem of distributing over time a limited amount of inventory across all the stores in a fast-fashion ret...
We introduce the controlled predictive linearGaussian model (cPLG), a model that uses predictive state to model discrete-time dynamical systems with real-valued observations and v...
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...