In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly f...
Research on information extraction from Web pages (wrapping) has seen much activity recently (particularly in system implementations), but little work has been done on formal...
Content-based image search on the Internet is a challenging problem, mostly due to the semantic gap between low-level visual features and high-level content, as well as the excess...
Consumer Generated Media (CGMs) -- such as blogs, news forums, message boards, and web pages -- are emerging as locations where consumers trade, discuss and influence each other...
Amit Behal, Julia Grace, Linda Kato, Ying Chen, Sh...
The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...