Abstract. Recent research has suggested that there is no general similarity measure that can be applied to arbitrary databases without any parameterization. Hence, the optimal co...
Search engine switching describes the voluntary transition from one Web search engine to another. In this paper, we present a study of search engine switching behavior that combi...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
Abstract. Information retrieval from web and XML document collections is ever more focused on returning entities instead of web pages or XML elements. There are many research field...
Jovan Pehcevski, Anne-Marie Vercoustre, James A. T...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...