The problem of finding clusters in data is challenging when clusters are of widely differing sizes, densities and shapes, and when the data contains large amounts of noise and out...
The frequent items problem is to process a stream of items and find all items occurring more than a given fraction of the time. It is one of the most heavily studied problems in d...
The burgeoning amount of textual data in distributed sources combined with the obstacles involved in creating and maintaining central repositories motivates the need for effective ...
Shenzhi Li, Christopher D. Janneck, Aditya P. Bela...
Recent advances in linear classification have shown that for applications such as document classification, the training can be extremely efficient. However, most of the existing t...
The recent digitization of more than twenty million books has been led by initiatives from countries wishing to preserve their cultural heritage and by commercial endeavors, such a...
Bing Hu, Thanawin Rakthanmanon, Bilson J. L. Campa...