The machine learning and pattern recognition communities face two challenges: the normalization problem and the deep learning problem. The normalization problem concerns the difficulty of training probabilistic models over large output spaces while keeping them properly normalized. In recent years, the ML and natural language processing communities have devoted considerable effort to circumventing this problem by developing "unnormalized" learning models for tasks in which the output is highly structured (e.g. English sentences). This class of models was in fact originally developed during the 1990s in the handwriting recognition community, and includes Graph Transformer Networks, Conditional Random Fields, Hidden Markov SVMs, and Maximum Margin Markov Networks. We describe these models within the unifying framework of "Energy-Based Models" (EBM). The deep learning problem concerns training all the levels of a recognition system (e.g. segmentatio...
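
To illustrate the normalization problem in one equation (notation assumed here, not taken from the source): turning an energy function $E(Y, X)$ into a properly normalized conditional distribution requires a partition function whose sum or integral runs over every possible structured output, which is typically intractable when $Y$ ranges over a large combinatorial space such as all English sentences. Unnormalized EBMs sidestep computing this term.

```latex
P(Y \mid X) \;=\; \frac{e^{-E(Y, X)}}{Z(X)},
\qquad
Z(X) \;=\; \sum_{y \in \mathcal{Y}} e^{-E(y, X)}
```

The denominator $Z(X)$ is the normalization term; EBM training criteria are designed to shape $E$ directly, without ever evaluating $Z(X)$.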