Sciweavers

416 search results - page 6 / 84
» Structured Web Pages Management for Efficient Data Retrieval
SIGIR
2004
ACM
14 years 26 days ago
Query-related data extraction of hidden web documents
Most of the information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
WWW
2007
ACM
14 years 8 months ago
Adaptive record extraction from web pages
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Justin Park, Denilson Barbosa
KDD
2003
ACM
14 years 7 months ago
Eliminating noisy information in Web pages for data mining
A commercial Web page typically contains many information blocks. Apart from the main content blocks, it usually has such blocks as navigation panels, copyright and privacy notice...
Lan Yi, Bing Liu, Xiaoli Li
WWW
2004
ACM
14 years 8 months ago
Efficient web change monitoring with page digest
The Internet and the World Wide Web have enabled a publishing explosion of useful online information, which has produced the unfortunate side effect of information overload: it is...
David Buttler, Daniel Rocco, Ling Liu
WIDM
2006
ACM
14 years 1 month ago
Coarse-grained classification of web sites by their structural properties
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...
Christoph Lindemann, Lars Littig