We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Metasearch engine, Comparison-shopping and Deep Web crawling applications need to extract search result records enwrapped in result pages returned from search engines in response ...
— We present three general approaches to detecting prototypical entities in a given taxonomy and apply them to a music information retrieval (MIR) problem. More precisely, we try...
Abstract. We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem...
To enable optimizations in memory access behavior of high performance applications, cache monitoring is a crucial process. Simulation of cache hardware is needed in order to allow...