In many topic identification applications, supervised training labels are indirectly related to the semantic content of the documents being classified. For example, many topical...
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
Image classification is a well-studied and hard problem in computer vision. We extend a proven solution for classifying web spam to handle images. We exploit the link structure of...
A representation of the World Wide Web as a directed graph, with vertices representing web pages and edges representing hypertext links, underpins the algorithms used by web search...
When dealing with information overload from the Internet, such as the classification of Web pages and the filtering of email spam, a new technique called cotraining has been shown...