Neural networks are a useful alternative to Gaussian mixture models for acoustic modeling; however, training multilayer networks involves a difficult, nonconvex optimization that requires some “art” to make work well in practice. In this paper we investigate the use of arccosine kernels for speech recognition, using these kernels in a hybrid support vector machine/hidden Markov model recognition system. Arccosine kernels approximate the computation in a certain class of infinite neural networks using a single kernel function, but can be used in learners that require only a convex optimization for training. Phone recognition experiments on the TIMIT corpus show that arccosine kernels can outperform radial basis function kernels.
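
To make the kernel computation concrete, the following is a minimal sketch of an arccosine kernel. It assumes the standard closed form k_n(x, y) = (1/π) ||x||^n ||y||^n J_n(θ), where θ is the angle between x and y. The function names `arccos_kernel` and `gram_matrix`, the restriction to degrees 0 to 2, and the handling of zero vectors are illustrative assumptions, not details of the system evaluated here.

```python
import math


def arccos_kernel(x, y, degree=1):
    """Arccosine kernel of degree 0, 1, or 2 between two vectors (sketch)."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        if degree == 0:
            # Assumption: the angle is undefined for a zero vector.
            raise ValueError("degree-0 kernel is undefined for zero vectors")
        return 0.0  # the norm factors vanish for degree >= 1

    dot = sum(a * b for a, b in zip(x, y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    theta = math.acos(cos_theta)

    # Angular part J_n(theta) for the first three degrees.
    if degree == 0:
        j = math.pi - theta
    elif degree == 1:
        j = math.sin(theta) + (math.pi - theta) * cos_theta
    elif degree == 2:
        j = (3.0 * math.sin(theta) * cos_theta
             + (math.pi - theta) * (1.0 + 2.0 * cos_theta ** 2))
    else:
        raise ValueError("this sketch only covers degrees 0, 1, and 2")

    return (norm_x * norm_y) ** degree * j / math.pi


def gram_matrix(samples, degree=1):
    """Kernel values for all pairs of samples, as a list of lists."""
    return [[arccos_kernel(a, b, degree) for b in samples] for a in samples]
```

A Gram matrix built this way could be passed to any kernel machine that accepts precomputed kernel values, which is what keeps training a convex problem. In this form, degree 1 corresponds to a single hidden layer of infinitely many units with a ramp (rectified linear) activation, and degree 0 to a step activation.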