Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Abstract. We report a surprising, persistent pattern in large sparse social graphs, which we term EigenSpokes. We focus on large Mobile Call graphs, spanning about 186K nodes and m...
B. Aditya Prakash, Ashwin Sridharan, Mukund Seshad...
Identifying similar keywords, known as broad matches, is an important task in online advertising that has become a standard feature on all major keyword advertising platforms. Eff...
We describe the architecture of a hypertext resource discovery system using a relational database. Such a system can answer questions that combine page contents, metadata, and hyp...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...