Current approaches to RDF graph indexing suffer from weak data locality, i.e., information regarding a piece of data appears in multiple locations, spanning multiple data structur...
A suffix tree is a fundamental data structure for string searching algorithms. Unfortunately, when it comes to the use of suffix trees in real-life applications, the current metho...
Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upt...
We study query processing in large graphs, a fundamental data model underpinning various social networks and Web structures. Given a set of query nodes, we aim to find the g...
The standard method for combating spam, either in email or on the web, is to train a classifier on manually labeled instances. As the spammers change their tactics, the performanc...
Deepak Chinavle, Pranam Kolari, Tim Oates, Tim Fin...
The nDCG measure has proven to be a popular measure of retrieval effectiveness utilizing graded relevance judgments. However, a number of different instantiations of nDCG exist, d...
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found ...
Information propagation within the blogosphere is of much importance in implementing policies, marketing research, launching new products, and other applications. In this paper, w...
Solid State Drives (SSDs), emerging as a new data storage medium with high random read speed, have been widely used in laptops, desktops, and data servers to replace hard disks during th...
Data mining can extract important knowledge from large data collections, but sometimes these collections are split among various parties. Privacy concerns may prevent the parties...