Sciweavers

» Automatic Data Extraction from Data-Rich Web Pages
ICMLA
2008
13 years 9 months ago
A Fully Automatic Crossword Generator
This paper presents a software system that is able to generate crosswords with no human intervention including definition generation and crossword compilation. In particular, the ...
Leonardo Rigutini, Michelangelo Diligenti, Marco M...
KDD
2007
ACM
Data Mining
14 years 8 months ago
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Shubin Zhao, Jonathan Betz
WSDM
2009
ACM
Data Mining
14 years 2 months ago
Clustering the tagged web
Automatically clustering web pages into semantic groups promises improved search and browsing on the web. In this paper, we demonstrate how user-generated tags from large-scale soc...
Daniel Ramage, Paul Heymann, Christopher D. Mannin...
ICDE
2008
IEEE
Database
14 years 9 months ago
Automatically Extracting Form Labels
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Hoa Nguyen, Eun Yong Kang, Juliana Freire
WWW
2005
ACM
14 years 8 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger