Recent work in deduplication has shown that collective deduplication of different attribute types can improve performance. But although these techniques cluster the attributes col...
Metasearch engine, comparison-shopping, and Deep Web crawling applications need to extract the search result records embedded in result pages returned by search engines in response ...
The primary purpose of news articles is to convey information about who, what, when and where. But learning and summarizing these relationships for collections of thousands to mil...
David Newman, Chaitanya Chemudugunta, Padhraic Smy...
In this paper, we study the problem of discovering interesting patterns through a user's interactive feedback. We assume a set of candidate patterns (i.e., frequent patterns) h...
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...