We investigate the task of finding links from Wikipedia pages to external web pages. Such external links significantly extend the information in Wikipedia with information from ...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose three measures for quantifying the alienness (...
Many applications in information retrieval, natural language processing, data mining, and related fields require a ranking of instances with respect to a specified criterion as op...
Embedded Computer-based Systems are becoming highly complex and hard to implement because of the large number of concerns the designers have to address. These systems are tightly...
High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...