The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neig...
This paper proposes a strategy to reduce the amount of hardware involved in the solution of search engine queries. It proposes using a secondary compact cache that keeps minimal i...
Many users and applications require the integration of semi-structured data from autonomous, heterogeneous Web sources. Over the last few years, mediator systems have emerged that use d...
In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schem...
Zhong Wu (Tsinghua University), Qifa Ke (Microsoft...
Snippets are used by almost every text search engine to complement the ranking scheme in order to effectively handle user searches, which are inherently ambiguous and whose relevance ...