In this paper we discuss the design of optimization algorithms for cognitive wireless networks (CWNs). A key function of any CWN is to maximize the network performance perceived by applications, by selecting appropriate protocols and carrying out cross-layer optimization on the resulting stack. We take a "black box" approach to the problem and study the use of simulated annealing for solving it. To improve the convergence rate of the basic algorithm, we apply machine learning techniques to construct graphical models of the perceived relations between network stack parameters and application-specific network utilities. We test our optimizer design both in a simulation environment and on a network testbed with low-power radios. Our results show that even basic simulated annealing works well, and that simple graphical models can further increase the convergence rate. However, the use of sophisticated models such as Bayesian networks does not always lead to substantially bet...
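
As an illustration of the black-box formulation, the following is a minimal sketch of basic simulated annealing over a discrete set of stack parameters. It is not the optimizer evaluated in this paper. The parameter names (`tx_power_dbm`, `backoff_exp`, `payload_bytes`), their value ranges, the geometric cooling schedule, and the toy utility function are all assumptions chosen for illustration; in a real CWN the utility would be measured from the running network rather than computed from a formula.

```python
import math
import random


def simulated_annealing(utility, domains, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Maximize a black-box utility over discrete parameter domains.

    utility: callable taking a dict {parameter: value} and returning a float.
    domains: dict {parameter: list of allowed values}.
    Returns the best configuration seen and its utility.
    """
    rng = random.Random(seed)
    names = list(domains)

    # Start from a random configuration.
    current = {n: rng.choice(domains[n]) for n in names}
    current_u = utility(current)
    best, best_u = dict(current), current_u

    temp = t0
    for _ in range(steps):
        # Neighbour move: re-draw the value of one randomly chosen parameter.
        name = rng.choice(names)
        candidate = dict(current)
        candidate[name] = rng.choice(domains[name])
        cand_u = utility(candidate)

        # Metropolis rule: always accept improvements, and accept
        # degradations with probability exp(delta / temperature).
        delta = cand_u - current_u
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_u = candidate, cand_u
            if current_u > best_u:
                best, best_u = dict(current), current_u

        # Geometric cooling (an assumed schedule), bounded away from zero.
        temp = max(temp * cooling, 1e-9)

    return best, best_u


# Hypothetical parameter domains, for illustration only.
DOMAINS = {
    "tx_power_dbm": [-10, -5, 0, 5, 10],
    "backoff_exp": [2, 3, 4, 5, 6],
    "payload_bytes": [16, 32, 64, 128],
}


def toy_utility(cfg):
    """Stand-in for a measured application utility (higher is better)."""
    penalty = (
        (cfg["tx_power_dbm"] - 0) ** 2
        + (cfg["backoff_exp"] - 4) ** 2
        + (cfg["payload_bytes"] - 64) ** 2 / 256.0
    )
    return 0.0 - penalty


best_cfg, best_val = simulated_annealing(toy_utility, DOMAINS)
print(best_cfg, best_val)
```

The optimizer only calls `utility(cfg)` and never inspects how the value is produced, which is what makes the approach black box. A learned model of parameter-utility relations could, under the same interface, be used to bias the neighbour move toward promising values instead of drawing them uniformly.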