Many scientific, financial, data mining and sensor network applications need to work with continuous, rather than discrete data e.g., temperature as a function of location, or sto...
One essential issue of document clustering is to estimate the appropriate number of clusters for a document collection to which documents should be partitioned. In this paper, we ...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
Many database applications require the analysis and processing of data streams. In such systems, huge amounts of data arrive rapidly and their values change over time. The variati...
Lv-an Tang, Bin Cui, Hongyan Li, Gaoshan Miao, Don...
Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mi...