Sciweavers

498 search results - page 1 / 100
» Robust web content extraction
WWW
2006
ACM
14 years 7 months ago
Robust web content extraction
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
RIAO
2007
13 years 8 months ago
A Robust Linguistic Platform for Efficient and Domain specific Web Content Analysis
Web semantic access in specific domains calls for specialized search engines with enhanced semantic querying and indexing capacities, which pertain both to information retrieval (...
Thierry Hamon, Adeline Nazarenko, Thierry Poibeau,...
WWW
2005
ACM
14 years 7 months ago
Hybrid semantic tagging for information extraction
The semantic web is expected to have an impact at least as big as that of the existing HTML based web, if not greater. However, the challenge lies in creating this semantic web an...
Ronen Feldman, Binyamin Rosenfeld, Moshe Fresko, B...
BIS
2006
13 years 8 months ago
Expected Utility of Content Blocks in Web Content Extraction
In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...
Marek Kowalkiewicz
ICWE
2010
Springer
13 years 5 months ago
Partial Information Extraction Approach to Lightweight Integration on the Web
We present a partial information extraction approach to lightweight integration on the Web. Our approach allows us to extract dynamic contents created by scripts as well as...
Junxia Guo, Prach Chaisatien, Hao Han, Tomoya Noro...