Recent advances in coherent and convex demodulation have proven useful for analyzing and modifying the low-frequency envelope structure of speech. This paper reports the application of both methods, referred to here as bandwidth-constrained demodulation, to large-scale speech recognition in the form of new feature representations. Modulation-based features yielded measurable improvement when included as complementary sources of information with a baseline recognizer. Furthermore, both sets of demodulation features showed promise for outperforming the conventional Hilbert envelope method, which underlies most modern speech recognition features. These experimental results show the potential for further development of feature representations based on recently developed bandwidth-constrained modulation signal models.
Pascal Clark, Gregory Sell, Les E. Atlas