Tasks recognizing named entities such as products, people names, or locations from documents have recently received significant attention in the literature. Many solutions to thes...
Enabling an intelligent access to multimedia data requires a powerful description language. In this paper, we demonstrate why the MPEG-7 standard fails to fulfill this task. We i...
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
Many applications which use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...