There has been considerable interest in random projections, an approximate algorithm for estimating distances between pairs of points in a high-dimensional vector space. Let A Rn...
Tasks of data mining and information retrieval depend on a good distance function for measuring similarity between data instances. The most effective distance function must be for...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Multimedia data mining requires the ability to automatically analyze and understand the content. The Community of Multimedia Agents project is devoted to creating a community of re...
Domain-oriented sentiment lexicons are widely used for finegrained sentiment analysis on reviews; therefore, the automatic construction of domain-oriented sentiment lexicon is a f...