We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
Web semantic access in specific domains calls for specialized search engines with enhanced semantic querying and indexing capacities, which pertain both to information retrieval (...
The semantic web is expected to have an impact at least as big as that of the existing HTML based web, if not greater. However, the challenge lays in creating this semantic web an...
In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...
Abstract. We present partial information extraction approach to lightweight integration on the Web. Our approach allows us to extract dynamic contents created by scripts as well as...