We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose three measures for quantifying the alienness (...
Many applications in information retrieval, natural language processing, data mining, and related fields require a ranking of instances with respect to a specified criterion as op...
In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages...
As the Web provides rich data embedded in the immense content of its pages, we witness many ad hoc efforts to exploit fine-grained information across Web text, such as We...
Personalization has been deemed one of the major challenges in information retrieval, with significant potential for providing a better search experience to individual users. Espec...
Julia Luxenburger, Shady Elbassuoni, Gerhard Weiku...