We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...
To enable smart environments and self-tuning data centers, we are developing the Aspen system for integrating physical sensor data, as well as stream data coming from machine logi...
Svilen R. Mihaylov, Marie Jacob, Zachary G. Ives, ...
KDD is a complex and demanding task. While a large number of methods has been established for numerous problems, many challenges remain to be solved. New tasks emerge requiring th...
Ingo Mierswa, Michael Wurst, Ralf Klinkenberg, Mar...
Matching regular expressions (regexps) is a very common workload. For example, tokenization, which consists of recognizing words or keywords in a character stream, appears in ever...
We present a case study about the application of the inductive database approach to the analysis of Web logs. We consider rich XML Web logs, called conceptual logs, that are gen...
Rosa Meo, Pier Luca Lanzi, Maristella Matera, Robe...