Traditional co-clustering methods identify block structures from static data matrices. However, the data matrices in many applications are dynamic; that is, they evolve smoothly o...
Many agent problems in a grid world have a restricted sensory information and motor actions. The environmental conditions need dynamic processing of internal memory. In this paper...
E-commerce has transformed the way firms develop their pricing strategies, producing shift away from fixed pricing to dynamic pricing. In this paper, we use two different Estim...
Siddhartha Shakya, Fernando Oliveira, Gilbert Owus...
MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...
H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...
The success ofreinforcement learninginpractical problems depends on the ability to combine function approximation with temporal di erence methods such as value iteration. Experime...