Sciweavers

103 search results - page 7 / 21
» Visual Web Information Extraction with Lixto
AAAI
2007
13 years 11 months ago
Template-Independent News Extraction Based on Visual Consistency
Wrapper is a traditional method to extract useful information from Web pages. Most previous works rely on the similarity between HTML tag trees and induced template-dependent wrap...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen
LREC
2008
13 years 10 months ago
Automatic Identification of Temporal Information in Tourism Web Pages
This paper presents our work on the detection of temporal information in web pages. The pages examined within the scope of this study were taken from the tourism sector and the te...
Stéphanie Weiser, Philippe Laublet, Jean-Lu...
EDBT
2009
ACM
14 years 3 months ago
High-performance information extraction with AliBaba
A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...
Peter Palaga, Long Nguyen, Ulf Leser, Jörg Ha...
WWW
2005
ACM
14 years 9 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
ML
2007
ACM
13 years 8 months ago
Interactive learning of node selecting tree transducer
We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. ...
Julien Carme, Rémi Gilleron, Aurélie...