Sciweavers

263 search results - page 33 / 53
» Re-engineering structures from Web documents
RULEML
2004
Springer
14 years 28 days ago
Rule Learning for Feature Values Extraction from HTML Product Information Sheets
The Web is now a huge information repository with a rich semantic structure that, however, is primarily addressed to human understanding rather than automated processing by a compu...
Costin Badica, Amelia Badica
DAS
2010
Springer
13 years 11 months ago
Analysis and taxonomy of column header categories for web tables
We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wa...
Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukka...
SIGMOD
2009
ACM
14 years 2 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
HICSS
2002
IEEE
14 years 16 days ago
Persona: A Contextualized and Personalized Web Search
Recent advances in graph-based search techniques derived from Kleinberg’s work [1] have been impressive. This paper further improves the graph-based search algorithm ...
Francisco Tanudjaja, Lik Mu
ICADL
2007
Springer
14 years 1 months ago
Using Automatic Metadata Extraction to Build a Structured Syllabus Repository
Syllabi are important documents created by instructors for students. Students use syllabi to find information and to prepare for class. Instructors often need to find similar syl...
Xiaoyan Yu, Manas Tungare, Weiguo Fan, Manuel A. P...