With the proliferation of online classifieds and auctions comes a new need to meaningfully search and organize the items for sale. However, since the seller's item descriptio...
Using an open-source, Java toolkit of name-matching methods, we experimentally compare string distance metrics on the task of matching entity names. We investigate a number of dif...
William W. Cohen, Pradeep D. Ravikumar, Stephen E....
We investigate if the mapping between text and time series data is feasible such that relevant data mining problems in text can find their counterparts in time series (and vice ver...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Numerous widely publicized cases of theft and misuse of private information underscore the need for audit technology to identify the sources of unauthorized disclosure. We present...
Rakesh Agrawal, Alexandre V. Evfimievski, Jerry Ki...