Despite the multitude of existing interfaces for annotation, little is known about their influence on the created annotations. In this paper, first findings of a comparative video...
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
There has been a tremendous growth in the amount of information and resources on the World Wide Web that are useful to researchers and practitioners in science domains. While the ...
Michael Chau, Zan Huang, Jialun Qin, Yilu Zhou, Hs...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manual...