In this paper we show how game theory and Gibbs sampling can be used to design a self-optimizing algorithm that minimizes end-to-end delays for all flows in a multi-class mobile ad hoc network (MANET). This improves on the widely used Ad hoc On-Demand Distance Vector (AODV) protocol, which computes a route with the minimum number of hops for each flow in a multi-flow ad hoc network. Here, the load of each flow is taken into account to choose the best route (in terms of delay) among a fixed number of candidate routes. The algorithm can be implemented in a fully distributed and asynchronous way and is guaranteed to converge to the globally optimal configuration. Numerical experiments over a large number of networks show that the gain over AODV is substantial.
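To make the route-selection idea concrete, the following is a minimal, centralized sketch of Gibbs sampling over route choices. It is not the algorithm of the paper. The M/M/1-style link delay, the logarithmic cooling schedule, and all names (`total_delay`, `gibbs_route_selection`, `routes`, `loads`, `capacity`) are assumptions made for illustration. In particular, the sketch evaluates the global cost directly, whereas a distributed implementation would have each flow use only locally available information.

```python
import math
import random


def total_delay(choice, routes, loads, capacity):
    """Sum of per-flow delays under an assumed M/M/1-style link delay.

    choice[f] is the index of the route used by flow f,
    routes[f] is the list of candidate routes (lists of link ids) of flow f.
    """
    link_load = {}
    for f, r in enumerate(choice):
        for link in routes[f][r]:
            link_load[link] = link_load.get(link, 0.0) + loads[f]
    cost = 0.0
    for f, r in enumerate(choice):
        for link in routes[f][r]:
            if link_load[link] >= capacity:
                return math.inf  # overloaded link: treated as infeasible
            cost += 1.0 / (capacity - link_load[link])
    return cost


def gibbs_route_selection(routes, loads, capacity, steps=2000, t0=1.0, seed=0):
    """Resample one flow's route at a time from a Gibbs distribution.

    Returns the lowest-cost configuration seen during the run.
    """
    rng = random.Random(seed)
    choice = [0] * len(routes)  # start with every flow on its first route
    best = list(choice)
    best_cost = total_delay(choice, routes, loads, capacity)
    for k in range(1, steps + 1):
        temp = t0 / math.log(k + 1)  # assumed cooling schedule
        f = rng.randrange(len(routes))  # flow that updates at this step
        costs = []
        for r in range(len(routes[f])):
            trial = choice[:f] + [r] + choice[f + 1:]
            costs.append(total_delay(trial, routes, loads, capacity))
        base = min(costs)
        if math.isinf(base):
            continue  # no feasible route for this flow right now
        # Probability of each route is proportional to exp(-cost / temp).
        weights = [math.exp(-(c - base) / temp) for c in costs]
        r_new = rng.choices(range(len(costs)), weights=weights)[0]
        choice[f] = r_new
        if costs[r_new] < best_cost:
            best, best_cost = list(choice), costs[r_new]
    return best


# Hypothetical example: two flows whose shortest routes share link "a".
routes = [
    [["a"], ["b", "c"]],  # candidate routes of flow 0
    [["a"], ["d", "e"]],  # candidate routes of flow 1
]
loads = [0.6, 0.6]
best = gibbs_route_selection(routes, loads, capacity=1.0)
print(best, total_delay(best, routes, loads, 1.0))
```

In this toy instance, placing both flows on their one-hop route overloads the shared link, so the sampler moves one flow to its longer route. This is the kind of load-aware choice that a pure hop-count criterion does not make.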