In this poster, we present an information extraction engine for web-based forums. The engine analyzes the HTML files crawled from web forums, deduces the wrapper (template) of the...
Hanny Yulius Limanto, Nguyen Ngoc Giang, Vo Tan Tr...
Large corpora of 3D models, such as Google 3D Warehouse, are now becoming available on the web. It is possible to search these databases using a keyword search. This makes it poss...
Research on information extraction from Web pages (wrapping) has seen much activity in recent times (particularly systems implementations), but little work has been done on formal...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...