Sciweavers

» Logic-based Web Information Extraction
WWW
2003
ACM
14 years 9 months ago
Efficient URL caching for world wide web crawling
Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)-(c). Howev...
Andrei Z. Broder, Marc Najork, Janet L. Wiener
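
The three steps in the abstract above can be sketched as a short breadth-first loop. This is only an illustration, not the paper's method: the names `crawl`, `fetch_links`, `max_pages`, and `TOY_WEB` are hypothetical, the in-memory `set` merely stands in for the "not seen before" test (it is not the caching scheme the paper studies), and a toy link graph replaces real network fetches.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(seed: str,
          fetch_links: Callable[[str], Iterable[str]],
          max_pages: int = 1000) -> list[str]:
    """Visit pages breadth-first, handling each URL at most once."""
    seen = {seed}             # every URL discovered so far
    frontier = deque([seed])  # URLs waiting to be fetched
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)            # (a) fetch the page
        for link in fetch_links(url):  # (b) extract its linked URLs
            if link not in seen:       # (c) follow only unseen URLs
                seen.add(link)
                frontier.append(link)
    return visited


# Toy link graph standing in for the web (an assumption for illustration).
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["d"],
    "d": [],
}


def toy_fetch_links(url: str) -> list[str]:
    """Return the outgoing links of a page in the toy graph."""
    return TOY_WEB.get(url, [])


print(crawl("a", toy_fetch_links))
```

In this sketch the membership test `link not in seen` is the step that runs once per extracted link, which is why its cost dominates as the set of known URLs grows.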
WSE
2006
IEEE
14 years 3 months ago
Modeling Request Routing in Web Applications
For web applications, determining how requests from a web page are routed through server components can be time-consuming and error-prone due to the complex set of rules and mecha...
Minmin Han, Christine Hofmeister
ECIR
2008
Springer
13 years 10 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron
JCDL
2011
ACM
12 years 12 months ago
Archiving the web using page changes patterns: a case study
A pattern is a model or a template used to summarize and describe the behavior (or the trend) of data that generally contain some recurrent events. Patterns have received a considerab...
Myriam Ben Saad, Stéphane Gançarski
SIGCOMM
2009
ACM
14 years 3 months ago
SMS-based contextual web search
SMS-based web search is different from traditional web search in that the final response to a search query is limited to a very small number of bytes (typically 1-2 SMS messages...
Jay Chen, Brendan Linn, Lakshminarayanan Subramani...