We propose a new machine learning paradigm called Graph Transformer Networks that extends the applicability of gradient-based learning algorithms to systems composed of modules that take graphs as inputs and produce graphs as output. Training is performed by computing gradients of a global objective function with respect to all the parameters in the system using a kind of back-propagation procedure. A complete check reading system based on these concepts is described. The system uses convolutional neural network character recognizers, combined with global training techniques to provides record accuracy on business and personal checks. It is presently deployed commercially and reads million of checks per month.