A framework for classification and segmentation of massive audio data streams



In recent years, the proliferation of VoIP data has created a number of applications in which it is desirable to perform quick online classification and recognition of massive voice streams. Such applications typically arise in real-time intelligence and surveillance. In many cases, the data streams may be in compressed format, and data must often be processed at rates of gigabits per second. All known techniques for speaker voice analysis require an offline training phase in which the system is trained with known segments of speech. The state-of-the-art method for text-independent speaker recognition is Gaussian Mixture Modeling (GMM), which requires an iterative Expectation-Maximization (EM) procedure for training and therefore cannot be implemented in real time. In this paper, we discuss the details of an online voice recognition system for such streams. For this purpose, we use our micro-clustering algorithms to design concise signatures of the target speake...
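The abstract is cut off before it describes the micro-clustering step, so the following is only a generic sketch of one-pass micro-clustering, not the paper's algorithm. It assumes each audio frame has already been reduced to a numeric feature vector, and it keeps an additive summary (count, linear sum, squared sum) per cluster so that no iterative retraining pass is needed. The names `MicroCluster` and `update`, and the fixed `radius` threshold, are illustrative assumptions.

```python
import math


class MicroCluster:
    """Additive summary of absorbed points: count, linear sum, squared sum."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)                # per-dimension linear sum
        self.ss = [x * x for x in point]     # per-dimension squared sum

    def centroid(self):
        return [s / self.n for s in self.ls]

    def absorb(self, point):
        # Constant-time update per dimension; no earlier points are revisited.
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update(clusters, point, radius):
    """Absorb the point into the nearest cluster if it lies within `radius`
    of that cluster's centroid; otherwise open a new cluster.
    (Hypothetical rule chosen for illustration.)"""
    if clusters:
        nearest = min(clusters, key=lambda c: distance(c.centroid(), point))
        if distance(nearest.centroid(), point) <= radius:
            nearest.absorb(point)
            return
    clusters.append(MicroCluster(point))


# Toy two-dimensional "feature vectors" standing in for audio frames.
stream = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.2)]
clusters = []
for p in stream:
    update(clusters, p, radius=1.0)
```

Because every statistic is additive, each incoming point costs one nearest-centroid search and one constant-time update. A production system would also cap the number of clusters and merge or discard stale ones, which this sketch omits.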

Charu C. Aggarwal


Data Mining | KDD 2007 | Keywords Speaker Recognition | Text-independent Speaker Recognition | Voice Recognition System |


» Mixed Type Audio Classification with Support Vector Machine

» Media segmentation using self-similarity decomposition

» XCRAB A Content and Annotation-Based Multimedia Indexing and Retrieval System

» Meta-classification: Combining Multimodal Classifiers

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2007
Where	KDD
Authors	Charu C. Aggarwal
