Search Sciweavers | Sciweavers

708 search results - page 14 / 142

» Identifying Content Blocks from Web Documents

SIGIR
2004
ACM

135 views · Information Technology

15 years 8 months ago

Query-related data extraction of hidden web documents

Download dis.shef.ac.uk

The larger amount of information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...

Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...

CORR
2010
Springer

106 views · Education

The WebContent XML Store

15 years 3 months ago

Download www-rocq.inria.fr

In this article, we describe the XML storage system used in the WebContent project. We begin by advocating the use of an XML database in order to store WebContent documents, and w...

Benjamin Nguyen, Spyros Zoupanos

ICDCSW
2002
IEEE

130 views · Computer Networks

Class-Based Delta-Encoding: A Scalable Scheme for Caching Dynamic Web Content

15 years 8 months ago

Download www-rcf.usc.edu

Abstract—Caching static HTTP traffic in proxy-caches has reduced bandwidth consumption and download latency. However, web-caching performance is hard to increase further due to ...

Konstantinos Psounis

GISCIENCE
2008
Springer

121 views · GIS

Identifying Maps on the World Wide Web

15 years 4 months ago

Download www.isi.edu

Abstract. This paper presents an automatic approach to mining collections of maps from the Web. Our method harvests images from the Web and then classifies them as maps or non-map...

Matthew Michelson, Aman Goel, Craig A. Knoblock

PKDD
2004
Springer

91 views · Data Mining

Summarization of Dynamic Content in Web Collections

15 years 8 months ago

Download www.miv.t.u-tokyo.ac.jp

This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...

Adam Jatowt, Mitsuru Ishizuka
