Sciweavers

» Identifying Content Blocks from Web Documents
SIGIR
2004
ACM
14 years 2 months ago
Query-related data extraction of hidden web documents
The larger part of the information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
CORR
2010
Springer
13 years 8 months ago
The WebContent XML Store
In this article, we describe the XML storage system used in the WebContent project. We begin by advocating the use of an XML database in order to store WebContent documents, and w...
Benjamin Nguyen, Spyros Zoupanos
ICDCSW
2002
IEEE
14 years 1 month ago
Class-Based Delta-Encoding: A Scalable Scheme for Caching Dynamic Web Content
Caching static HTTP traffic in proxy-caches has reduced bandwidth consumption and download latency. However, web-caching performance is hard to increase further due to ...
Konstantinos Psounis
GISCIENCE
2008
Springer
13 years 9 months ago
Identifying Maps on the World Wide Web
This paper presents an automatic approach to mining collections of maps from the Web. Our method harvests images from the Web and then classifies them as maps or non-map...
Matthew Michelson, Aman Goel, Craig A. Knoblock
PKDD
2004
Springer
14 years 1 month ago
Summarization of Dynamic Content in Web Collections
This paper describes a new research proposal for multi-document summarization of dynamic content in web pages. Much information is lost on the Web due to the temporal character of w...
Adam Jatowt, Mitsuru Ishizuka