Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
Skew is prevalent in data streams, and should be taken into account by algorithms that analyze the data. The problem of finding "biased quantiles" -- that is, approximate...
Graham Cormode, Flip Korn, S. Muthukrishnan, Dives...
In many applications, transaction data arrive in the form of high speed data streams. These data contain a lot of information about customers that needs to be carefully managed ...
Weiping Wang 0001, Jianzhong Li, Chunyu Ai, Yingsh...
Whereas schools typically record mounds of data regarding student performance, attendance, and other behaviors over the course of a school year, rarely is that data consulted and ...
Large and complex graphs representing relationships among sets of entities are an increasingly common focus of interest in data analysis--examples include social networks, Web gra...