Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan...
We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...
Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained with using a set of obser...
Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanat...
In a M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. O...
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...