Providers such as YouTube offer easy access to multimedia content to millions, generating high bandwidth and storage demand on the Content Delivery Networks they rely upon. More ...
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomy-dire...
We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engin...
Alexandros Ntoulas, Marc Najork, Mark Manasse, Den...
The proliferation of content-based image retrieval techniques has highlighted the need to understand the relationship between image clustering based on low-level image features and...