The World Wide Web is a large, heterogeneous, distributedcollectionof documents connected by hypertext links. The most common technologycurrently used for searching the Web depend...
Alberto O. Mendelzon, George A. Mihaila, Tova Milo
Document summarization plays an increasingly important role with the exponential growth of documents on the Web. Many supervised and unsupervised approaches have been proposed to ...
Liangda Li, Ke Zhou, Gui-Rong Xue, Hongyuan Zha, Y...
We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The OAI-PMH is the de facto standard for metadata...
Michael L. Nelson, Herbert Van de Sompel, Xiaoming...
The PageRank algorithm, used in the Google search engine, greatly improves the results of Web search by taking into account the link structure of the Web. PageRank assigns to a pa...
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...