We use simulated soccer to study multiagent learning. Each team's players (agents) share action set and policy, but may behave differently due to position-dependent inputs. All...
The UCT algorithm learns a value function online using sample-based search. The TD(λ) algorithm can learn a value function offline for the on-policy distribution. We consider three...
We propose a global algorithm for learning entailment relations between predicates. We define a graph structure over predicates that represents entailment relations as directed ed...
Quadratic program relaxations are proposed as an alternative to linear program relaxations and tree reweighted belief propagation for the metric labeling or MAP estimation problem...
The task in control allocation is to determine how to generate a specified generalized force from a redundant set of control effectors where the associated actuator control in...