TOP-SURF is an image descriptor that combines interest points with visual words, resulting in a high-performance yet compact descriptor that is designed with a wide range of conte...
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
We introduce a novel set of social network analysis based algorithms for mining the Web, blogs, and online forums to identify trends and find the people launching these new tren...
Peter A. Gloor, Jonas Krauss, Stefan Nann, Kai Fis...
Along with the ever-growing Web comes the proliferation of objectionable content, such as pornography, violence, horror information, etc. Horror videos, whose threat to children's ...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...