Abstract-Wikipedia is an example of the collaborative, semi-structured data sets emerging on the Web. These data sets have large, nonuniform schema that require costly data integra...
Bryan Chan, Leslie Wu, Justin Talbot, Mike Cammara...
Clones are code segments that have been created by copying-and-pasting from other code segments. Clones occur often in large software systems. It is reported that 5 to 50% of the ...
With the explosion of social media, scalability becomes a key challenge. There are two main aspects of the problems that arise: 1) data volume: how to manage and analyze huge data...
Ching-Yung Lin, Jimeng Sun, Nan Cao, Shixia Liu, S...
PageRank computes the importance of each node in a directed graph under a random surfer model governed by a teleportation parameter. Commonly denoted alpha, this parameter models ...
David F. Gleich, Paul G. Constantine, Abraham D. F...
We study a novel clustering problem in which the pairwise relations between objects are categorical. This problem can be viewed as clustering the vertices of a graph whose edges a...
Francesco Bonchi, Aristides Gionis, Francesco Gull...