A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Recent years have witnessed an explosion in the availability of news articles on the World Wide Web. Although searchengines’ algorithms have made it easier to locate these docum...
COSMIC (http://www.sanger.ac.uk/cosmic) curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136 000 coding mutations in ...
Simon A. Forbes, Nidhi Bindal, Sally Bamford, Char...
\Web users are nowadays confronted with the huge variety of available information sources whose content is not targeted at any specific group or layer. Recommendation systems aim...
In this paper, we present a novel solution to the image annotation problem which annotates images using search and data mining technologies. An accurate keyword is required to ini...