The database tier of dynamic content servers at large Internet sites is typically hosted on centralized and expensive hardware. Recently, research prototypes have proposed using d...
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
In this paper, we present a general architecture and a new retrieval method for an educational system that is based on a knowledge base of existing recorded lectures. The extracti...
In this paper, a system for Named Entity Recognition in the Open domain (NERO) is described. It is concerned with recognition of various types of entity, types that will be approp...
In this paper, we consider the possibility of altering the PageRank of web pages, from an administrator's point of view, through the modification of the PageRank equation. It...