Sciweavers

WWW
2010
ACM
13 years 9 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
DEXAW
2007
IEEE
13 years 10 months ago
A Process Improvement Approach to Improve Web Form Design and Usability
The research presented in this paper is an examination of how the concepts used in process improvement may be applied to a web form to improve design and usability. Although much ...
Sean Thompson, Torab Torabi
KDD
2007
ACM
14 years 9 months ago
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Shubin Zhao, Jonathan Betz
ICDE
2008
IEEE
14 years 10 months ago
Automatically Extracting Form Labels
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Hoa Nguyen, Eun Yong Kang, Juliana Freire
NAACL
2010
13 years 6 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova