Sciweavers

232 search results - page 13 / 47
» Query-related data extraction of hidden web documents
KDD
2002
ACM
14 years 8 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho
CIKM
2005
Springer
14 years 1 month ago
Concept-based interactive query expansion
Despite the recent advances in search quality, the fast increase in the size of the Web collection has introduced new challenges for Web ranking algorithms. In fact, there are sti...
Bruno M. Fonseca, Paulo Braz Golgher, Bruno Pô...
SAINT
2005
IEEE
14 years 1 month ago
Learning Logic Wrappers for Information Extraction from the Web
This paper discusses a methodology for applying general-purpose first-order inductive learning to extract information from Web documents structured as unranked ordered trees. The...
Costin Badica, Elvira Popescu, Amelia Badica
DEXA
2005
Springer
14 years 1 month ago
An XML Approach to Semantically Extract Data from HTML Tables
Data intensive information is often published on the internet in the format of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen
PVLDB
2010
13 years 6 months ago
SXPath - Extending XPath towards Spatial Querying on Web Documents
Querying data from presentation formats like HTML, for purposes such as information extraction, requires the consideration of tree structures as well as the consideration of spati...
Ermelinda Oro, Massimo Ruffolo, Steffen Staab