Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Document clustering has been used for better document retrieval, document browsing, and text mining in digital libraries. In this paper, we perform a comprehensive comparison study ...
The emergence of the world-wide-web has led to an increased interest in methods for searching for information. A key characteristic of many of the online document collections is t...
To resolve some of the lexical disagreement problems between queries and FAQs, we propose a reliable FAQ retrieval system using query log clustering. At indexing time, the proposed sy...