A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
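The exclusion rules live in a plain-text file at the site root. A minimal illustrative robots.txt (all paths and agent names here are hypothetical, not taken from the abstract):

```
# Allow all crawlers by default, but keep them out of one directory
User-agent: *
Disallow: /private/

# Block one specific crawler entirely
User-agent: BadBot
Disallow: /
```

Directives are grouped by `User-agent`; a crawler applies the most specific group that matches its name.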
Emerging applications such as personalized portals, enterprise search and web integration systems often require keyword search over semi-structured views. However, traditional inf...
Feng Shao, Lin Guo, Chavdar Botev, Anand Bhaskar, ...
Images are amongst the most widely proliferated form of digital information due to affordable imaging technologies and the Web. In such an environment, the use of digital watermar...
A minimal perfect hash function maps a static set of n keys onto the range of integers {0, 1, 2, ..., n-1}. We present a scalable high performance algorithm based on random graphs for ...
Kumar Chellapilla, Anton Mityagin, Denis Xavier Ch...
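The random-graph construction referenced in the abstract can be sketched as follows. This is a generic CHM-style sketch, not the paper's exact algorithm: each key becomes an edge between two hashed vertices, and if the resulting graph is acyclic, vertex labels `g[]` can be assigned by traversal so that `(g[h1(k)] + g[h2(k)]) % n` recovers each key's index. The function names, the vertex-to-key ratio, and the seeded use of Python's built-in `hash` are illustrative assumptions.

```python
import random

def build_mph(keys, ratio=2.1, max_tries=100):
    """Build a minimal perfect hash for `keys` via an acyclic random graph
    (CHM-style sketch). Returns a function mapping each key to a distinct
    integer in {0, ..., n-1}."""
    n = len(keys)
    m = max(int(ratio * n) + 1, n + 2)  # ratio > 2 keeps the graph acyclic w.h.p.
    for _ in range(max_tries):
        seed1, seed2 = random.randrange(1 << 30), random.randrange(1 << 30)
        h1 = lambda k: hash((seed1, k)) % m
        h2 = lambda k: hash((seed2, k)) % m
        adj = [[] for _ in range(m)]        # vertex -> [(neighbor, edge/key index)]
        ok = True
        for i, k in enumerate(keys):
            u, v = h1(k), h2(k)
            if u == v:                      # self-loop: cannot be acyclic, retry
                ok = False
                break
            adj[u].append((v, i))
            adj[v].append((u, i))
        if not ok:
            continue
        # Traverse each component, assigning g[] so edge i satisfies
        # (g[u] + g[v]) % n == i; a revisited vertex means a cycle -> retry.
        g = [0] * m
        visited = [False] * m
        for root in range(m):
            if visited[root] or not adj[root]:
                continue
            visited[root] = True
            stack = [(root, -1)]
            while stack and ok:
                u, parent_edge = stack.pop()
                for v, i in adj[u]:
                    if i == parent_edge:    # each edge index appears once per endpoint
                        continue
                    if visited[v]:
                        ok = False          # cycle detected
                        break
                    g[v] = (i - g[u]) % n
                    visited[v] = True
                    stack.append((v, i))
            if not ok:
                break
        if ok:
            return lambda k: (g[h1(k)] + g[h2(k)]) % n
    raise RuntimeError("failed to build MPH; increase ratio or max_tries")
```

Because the key set is static, a failed (cyclic) attempt simply retries with fresh seeds; with a vertex-to-edge ratio above 2, only a constant expected number of attempts is needed.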
As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
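A common baseline for the duplicate-detection task described above is shingling with Jaccard similarity; the sketch below is a generic illustration (brute-force pairwise comparison), not the method proposed in the abstract, and the thresholds and shingle size are assumptions.

```python
def shingles(text, k=4):
    """Set of lowercased character k-grams (shingles) of a document."""
    t = " ".join(text.lower().split())   # normalize whitespace
    return {t[i:i + k] for i in range(max(len(t) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(docs, threshold=0.7, k=4):
    """Index pairs of documents whose shingle sets overlap by at least
    `threshold`. Brute force here; large repositories use minhash/LSH
    to avoid comparing all pairs."""
    sh = [shingles(d, k) for d in docs]
    return [(i, j)
            for i in range(len(docs))
            for j in range(i + 1, len(docs))
            if jaccard(sh[i], sh[j]) >= threshold]
```

On a repository of millions of documents the quadratic pair loop is the bottleneck, which is why production systems replace it with sketching and locality-sensitive hashing.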