This paper describes IBCOW (Image-based Classification of Objectionable Websites), a system capable of classifying a website as objectionable or benign based on image content. The sys...
James Ze Wang, Jia Li, Gio Wiederhold, Oscar Firsc...
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many...
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C...
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data – which is typically generated by a (semi) automated information extraction (IE) ...
Recently, the semantic text portion (STP) has been gaining popularity in the field of Web mining. An STP is a text portion in the original page that is semantically related to the anchor pointing...
It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...