Tweets are the most up-to-date and inclusive stream of information and commentary on current events, but they are also fragmented and noisy, motivating the need for systems that c...
Objects with multiple numeric attributes can be compared within any “subspace” (subset of attributes). In applications such as computational journalism, users are interested i...
You Wu, Pankaj K. Agarwal, Chengkai Li, Jun Yang 0...
Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series d...
Thanawin Rakthanmanon, Bilson J. L. Campana, Abdul...
In recent years, both hashing-based similarity search and multimodal similarity search have aroused much research interest in the data mining and other communities. While hashing-...
The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. Real-time analytics on such data is challenging wit...
Most existing research about online trust assumes static trust relations between users. As we are informed by social sciences, trust evolves as humans interact. Little work exists...
Jiliang Tang, Huan Liu, Huiji Gao, Atish Das Sarma...
An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard freq...
Online advertising is becoming increasingly performance oriented, where the decision to show an advertisement to a user is made based on the user’s propensity to respond to...
The communities of a social network are sets of vertices with more connections inside the set than outside. We theoretically demonstrate that two commonly observed properties of s...