: In many prediction problems, including those that arise in computer security and computational finance, the process generating the data is best modeled as an adversary with whom ...
We consider a general adversarial stochastic optimization model. Our model involves the design of a system that an adversary may subsequently attempt to destroy or degrade. We int...
Matthew D. Bailey, Steven M. Shechter, Andrew J. S...
Abstract. We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Inter...