Multiple topics and varying length are two properties of web pages that significantly degrade the performance of web search. In this paper, we explore the use of page segmentati...
As user demands become increasingly sophisticated, search engines today are competing in more than just returning document results from the Web. One area of competition is providi...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been widely exploited in many monolingual...
Searching specialized collections, such as biomedical literature, typically requires intimate knowledge of specialized terminology. Hence, it can be a disappointing exp...
Recent research has had some success using the length of time a user displays a document in their web browser as implicit feedback for document preference. However, most studies h...