We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is used in various applications, including fraud detection. We develop a solu...
Web data integration is an important preprocessing step for web mining. Several records on the web whose textual representations differ are highly likely to represent the ...
The increasing popularity of social networks has initiated a fertile research area in information extraction and data mining. Although such analysis can facilitate better understan...
Since current search engines employ link-based ranking algorithms as an important tool for ranking sites, Web spammers are making a significant effort to man...
In this paper, we propose an efficient method for finding consistent correspondences between two sets of features. Our matching algorithm augments the discriminative pow...