Abstract. Term extraction is an important problem in natural language processing. In this paper, we propose a language independent statistical corpus-based term extraction algorith...
Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but...
This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To...
Shafiq R. Joty, Giuseppe Carenini, Gabriel Murray,...
We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves c...
Abstract—Video shot boundary detection is one of the fundamental tasks of video indexing and retrieval applications. Although many methods have been proposed for this task, find...
Duy-Dinh Le, Shin'ichi Satoh, Thanh Duc Ngo, Duc A...