While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
This extended abstract summarizes the research presented in Dr. Pardoe’s recently completed Ph.D. thesis [Pardoe 2011]. The thesis considers how adaptive trading agents can take advantag...
We address the problem of improving the efficiency of natural language text input under degraded conditions (for instance, on mobile computing devices or by disabled users), by ta...
Abstract. The network measurement community has proposed multiple machine learning (ML) methods for traffic classification in recent years. Although several research works ha...
Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework i...