This paper addresses the problem of discovering conversational group dynamics from nonverbal cues extracted from thin-slices of interaction. We first propose and analyze a novel t...
We describe the objectives and organization of the CLEF 2006 ad hoc track and discuss the main characteristics of the tasks offered to test monolingual, bilingual, and multilingual...
Giorgio Maria Di Nunzio, Nicola Ferro, Thomas Mand...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...