Partially observable Markov decision processes (POMDPs) allow one to model complex dynamic decision or control problems that include both action outcome uncertainty and imperfect ...
Support vector machines utilizing the 1-norm, typically set up as linear programs (Mangasarian, 2000; Bradley and Mangasarian, 1998), are formulated here as a completely unconstra...
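For orientation, the linear-programming setup this abstract refers to can be written out in generic notation. The symbols below (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, slack variables $\xi_i$, trade-off weight $C$, and the split $w = p - q$) are assumptions chosen for illustration, not the notation of the cited papers, and this shows only the standard constrained form, not the unconstrained reformulation the abstract goes on to describe.

```latex
% 1-norm soft-margin SVM (generic notation, assumed for illustration)
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \|w\|_1 + C \sum_{i=1}^{m} \xi_i \\
\text{s.t.}\quad & y_i\,(w^\top x_i - b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1,\dots,m .
\end{aligned}

% Writing w = p - q with p, q \ge 0 makes the objective linear,
% which gives an equivalent linear program:
\begin{aligned}
\min_{p,\,q,\,b,\,\xi}\quad & \mathbf{1}^\top (p + q) + C \sum_{i=1}^{m} \xi_i \\
\text{s.t.}\quad & y_i\,\bigl((p - q)^\top x_i - b\bigr) \ge 1 - \xi_i, \qquad p,\, q,\, \xi \ge 0 .
\end{aligned}
```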
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
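To make the two settings being compared concrete, here is a minimal sketch of TD(0) with a linear value estimate, run once with a discount factor and once in an average-reward form. Everything in it is an illustrative assumption rather than the paper's construction: the function name `td0_linear`, the two-state chain, the one-hot features, the constant step sizes `alpha` and `beta`, and the restriction to TD(0).

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td0_linear(features, transitions, rewards, gamma=None,
               steps=50000, alpha=0.01, beta=0.01, seed=0):
    """TD(0) with the linear estimate v(s) = w . features[s].

    gamma given  -> discounted TD(0).
    gamma = None -> average-reward TD(0): the reward is centred by a
                    running estimate of the average reward.
    Returns (w, avg_reward); avg_reward stays 0.0 in the discounted case.
    """
    rng = random.Random(seed)
    states = list(range(len(transitions)))
    w = [0.0] * len(features[0])
    avg_reward = 0.0
    s = 0
    for _ in range(steps):
        # Sample the next state from row s of the transition matrix.
        s_next = rng.choices(states, weights=transitions[s])[0]
        r = rewards[s]
        v, v_next = dot(w, features[s]), dot(w, features[s_next])
        if gamma is None:
            # Average-reward temporal difference.
            delta = r - avg_reward + v_next - v
            avg_reward += beta * (r - avg_reward)
        else:
            # Discounted temporal difference.
            delta = r + gamma * v_next - v
        # Gradient-style update along the current state's features.
        w = [wi + alpha * delta * fi for wi, fi in zip(w, features[s])]
        s = s_next
    return w, avg_reward


# Hypothetical two-state chain: uniform transitions, reward 1 in state 0.
features = [[1.0, 0.0], [0.0, 1.0]]
transitions = [[0.5, 0.5], [0.5, 0.5]]
rewards = [1.0, 0.0]

w_disc, _ = td0_linear(features, transitions, rewards, gamma=0.9)
w_avg, rho = td0_linear(features, transitions, rewards, gamma=None)
```

For this particular chain the discounted Bellman equation with discount 0.9 has the exact solution (5.5, 4.5), and the average reward is 0.5. The average-reward weights are determined only up to an additive constant, so the quantity to compare between the two runs is the difference `w[0] - w[1]`, which should settle near 1 in both cases.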
We formulate the L2-gain control problem for a general nonlinear, state-space system with projection dynamics in the state evolution and hard constraints on the set of admissible ...
Joseph A. Ball, Martin V. Day, Tungsheng Yu, Pushk...
As shown in [7], optimal control problems with either ODE or PDE dynamics can be solved efficiently using a setting of consistent approximations obtained by numerical discretizati...