Sciweavers

543 search results - page 58 / 109
» Exploiting content redundancy for web information extraction
PKDD
2007
Springer
Data Mining
14 years 1 month ago
Site-Independent Template-Block Detection
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
Aleksander Kolcz, Wen-tau Yih
SIGMOD
2004
ACM
Database
14 years 7 months ago
When one Sample is not Enough: Improving Text Database Selection Using Shrinkage
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
Panagiotis G. Ipeirotis, Luis Gravano
CIKM
2009
ACM
14 years 2 months ago
Easiest-first search: towards comprehension-based web search
Although Web search engines have become information gateways to the Internet, for queries containing technical terms, search results often contain pages that are difficult to be ...
Makoto Nakatani, Adam Jatowt, Katsumi Tanaka
WWW
2010
ACM
14 years 2 months ago
Shout out: integrating news and reader comments
A useful approach for enabling computers to automatically create new content is utilizing the text, media, and information already present on the World Wide Web. The newly created...
Lisa M. Gandy, Nathan D. Nichols, Kristian J. Hamm...
JASIS
2006
13 years 7 months ago
Web unit-based mining of homepage relationships
Homepages usually describe important semantic information about conceptual or physical entities, and are hence the main targets for searching and browsing. To facilitate s...
Aixin Sun, Ee-Peng Lim