Deriving a thematically meaningful partition of an unlabeled document corpus is a challenging task. In this context, the use of document representations based on latent thematic ge...
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
This paper presents a novel approach for content-based analysis of karaoke music, which utilizes multimodal contents including synchronized lyrics text from the video channel and ...
Conventional segmentation technique cannot extract difficult characters such as an isolated character and a touching character. In this paper, we propose a novel character recogn...
Information retrieval needs to match relevant texts with a given query. Selecting appropriate parts is useful when documents are long, and only portions are interesting to the user...