The early success of link-based ranking algorithms was predicated on the assumption that links imply merit of the target pages. However, today many links exist for purposes other ...
In the present paper, we present the theoretical basis, as well as an empirical validation, of a protocol designed to obtain effective VC dimension estimations in the case of a si...
Many organizations today have more than very large databases; they have databases that grow without limit at a rate of several million records per day. Mining these continuous dat...
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
Image classification is a well-studied and hard problem in computer vision. We extend a proven solution for classifying web spam to handle images. We exploit the link structure of...