This work studies the control of robots in the adversarial world of "Hunt the Wumpus". The hybrid learning algorithm that controls the robots' behavior combines a modified RPNI algorithm with a utility update algorithm. The modified RPNI algorithm, a DFA (deterministic finite automaton) learning algorithm, learns the opponents' strategies. The utility update algorithm then uses the information gleaned from the modified RPNI to bring the agent's mission quickly to a successful conclusion.
Gilbert L. Peterson, Diane J. Cook
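To make the DFA learning component more concrete, the following is a minimal sketch of the prefix tree acceptor that the standard RPNI algorithm builds from positive samples before it attempts any state merges. It does not reproduce the authors' modified RPNI. The encoding of opponent moves as the letters N, S, E, and W, the sample traces, and the function names `build_pta` and `accepts` are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Transitions = Dict[Tuple[int, str], int]


def build_pta(positive: List[str]) -> Tuple[Transitions, Set[int]]:
    """Build a prefix tree acceptor (PTA) from positive sample strings.

    State 0 is the root. Each distinct prefix of a sample gets its own state,
    and the state reached at the end of a sample is marked accepting.
    """
    transitions: Transitions = {}
    accepting: Set[int] = set()
    next_state = 1
    for word in positive:
        state = 0
        for symbol in word:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)
    return transitions, accepting


def accepts(transitions: Transitions, accepting: Set[int], word: str) -> bool:
    """Return True if the automaton accepts the word, starting from state 0."""
    state = 0
    for symbol in word:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition defined: reject
        state = transitions[key]
    return state in accepting


# Hypothetical observed opponent move sequences (assumed encoding: N, S, E, W).
observed = ["NE", "NEE", "S"]
transitions, accepting = build_pta(observed)
```

The standard RPNI algorithm would go on to merge states of this tree whenever the merged automaton still rejects every negative sample, which generalizes the observed traces into a compact model of the opponent's strategy. That merging step, and whatever changes the modified version makes to it, are omitted here.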