Unlike familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic re...
Many keyword-based approaches to text classification, information retrieval, or even user modeling for adaptive web-based systems could benefit from knowledge on relation...
K-Means clustering is widely used in information retrieval and data mining. Distributed K-Means variants have already been proposed, but none of the past algorithms scales to large...
Odysseas Papapetrou, Wolf Siberski, Fabian Leitrit...
We have aligned Japanese and English news articles and sentences to make a large parallel corpus. We first used a method based on cross-language information retrieval (CLIR) to a...
To organize, browse, and retrieve digital video content more efficiently, it is important to extract video structure information at both scene and shot levels. This paper pre...