Traditional hop-by-hop dynamic routing uses network resources inefficiently: it forwards packets along shortest paths that are already congested, while longer, uncongested paths remain underutilized. To maintain network-wide load balancing, we propose Autonomous Network management with Team learning based Self-configuration (ANTS), which attempts to manage feasible routes for traffic flows subject to QoS constraints in heterogeneous networks. To enable cognitive intelligence for network-wide load balancing, we implement a cross-layer mechanism in which learning agents in the middleware layer monitor the queue sizes of the MAC layer, allowing them to discover optimal routes. OPNET simulation results show that, compared with the original OSPF and AODV (2.18 Mbit/s with a 46.46% packet loss rate), ANTS achieves substantially higher packet delivery (9.57 Mbit/s with a 0.53% packet loss rate).

Keywords: cognitive networks, load balancing, reinforcement learning, wireless netwo...