We consider representing a short temporal fragment of musical audio as a dynamic texture, a model of both the timbral and rhythmical qualities of sound, two of the important asp...
Luke Barrington, Antoni B. Chan, Gert R. G. Lanckr...
Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separat...
We present an algorithm for removing time-frequency components, found by a standard Gabor transform, of a “real-world” sound while causing no audible difference to th...
We describe some high-level approaches to estimating confidence scores for the words output by a speech recognizer. By "high-level" we mean that the proposed me...
When automatic speech recognition (ASR) and speaker verification (SV) are applied in adverse acoustic environments, endpoint detection and energy normalization can be crucial to th...