The class PCP(f(n), g(n)) consists of all languages L for which there exists a polynomial-time probabilistic oracle machine that uses O(f(n)) random bits, queries O(g(n)) bits of ...
Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques...
We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
We prove that the edges of every graph of bounded (Euler) genus can be partitioned into any prescribed number k of pieces such that contracting any piece results in a graph of bou...
Erik D. Demaine, MohammadTaghi Hajiaghayi, Bojan M...