This article proposes an algorithm that automatically learns useful transformations of data to improve accuracy in supervised classification tasks. The transformations take the form of a mixture of base transformations and are learned by maximizing the kernel alignment criterion. Because the resulting optimization is nonconvex, a semidefinite relaxation is derived to find an approximate global solution. This new convex algorithm learns kernels composed of a matrix mixture of transformations. The formulation yields a simpler optimization while achieving accuracies comparable to or better than previous transformation-learning algorithms based on maximizing the margin. Remarkably, the new optimization problem does not slow down as additional data becomes available, allowing it to scale to large datasets. One application of this method is learning monotonic transformations constructed from a base set of truncated ramp functions. These monotonic transformations permit a nonlinear filtering o...
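The two ingredients named above can be sketched in a few lines: a monotonic transformation built as a nonnegative mixture of truncated ramp functions, scored by the empirical kernel alignment between a kernel matrix and the labels (Cristianini et al.'s definition). The ramp parameterization, knot placement, and toy data below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def truncated_ramp(x, knot):
    """Truncated ramp: 0 below the knot, rising linearly, capped at 1.
    (One common parameterization; the paper's exact form may differ.)"""
    return np.clip(x - knot, 0.0, 1.0)

def monotonic_transform(x, knots, weights):
    """A nonnegative mixture of nondecreasing ramps is itself nondecreasing."""
    assert np.all(np.asarray(weights) >= 0)
    return sum(w * truncated_ramp(x, k) for w, k in zip(weights, knots))

def kernel_alignment(K, y):
    """Empirical alignment <K, yy^T>_F / (||K||_F ||yy^T||_F), y in {-1,+1}."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

# Toy example: transform the features, form a linear kernel, score alignment.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=20))

knots = np.array([-1.0, 0.0, 1.0])
weights = np.array([0.5, 1.0, 0.5])   # nonnegative weights -> monotonic map
Xt = monotonic_transform(X, knots, weights)
K = Xt @ Xt.T                         # linear kernel on transformed features
print(kernel_alignment(K, y))
```

Learning then amounts to choosing the mixture weights to maximize this alignment; the article's contribution is the semidefinite relaxation that makes that search convex.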