Sciweavers

SIGIR
2002
ACM
Finding relevant documents using top ranking sentences: an evaluation of two alternative schemes
In this paper we present an evaluation of techniques that are designed to encourage web searchers to interact more with the results of a web search. Two specific techniques are ex...
Ryen White, Ian Ruthven, Joemon M. Jose
WWW
2005
ACM
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
WWW
2009
ACM
Estimating web site readability using content extraction
Nowadays, information is primarily searched on the WWW. From a user perspective, the readability is an important criterion for measuring the accessibility and thereby the quality ...
Thomas Gottron, Ludger Martin
WWW
2009
ACM
Extracting data records from the web using tag path clustering
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Gengxin Miao, Jun'ichi Tatemura, Wang-Pin Hsiung, ...
WWW
2009
ACM
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth