Systems designed to extract time-critical information from large volumes of unstructured data must be able, from both an architectural and an algorithmic point of view, to filter out unimportant data that might otherwise overwhelm the available resources. This paper presents an approach to data filtering that reduces computation in a distributed speech processing architecture designed to detect or identify speakers. Here, filtering means either dropping data or passing it on for further processing. The goal of the paper is to show that when the filter is designed to select and pass on the subset of the input data that best preserves the ability to recognize a specific desired speaker, or group of speakers, a large percentage of the data can be ignored while preserving most of the accuracy.
Upendra V. Chaudhari, Olivier Verscheure, Juan Hue