Distributed and parallel computing environments are becoming cheap and commonplace. The availability of large numbers of CPUs makes it possible to process more data at highe...
We show that the e-commerce domain can provide all the right ingredients for successful data mining. We describe an integrated architecture for supporting this integration. The ar...
Suhail Ansari, Ron Kohavi, Llew Mason, Zijian Zhen...
Crawlers in a knowledge management system need to collect and archive documents from websites, and also track the change status of these documents. However, the existence of URL r...
In this paper, we present a new technique, called Stream Projected Outlier deTector (SPOT), to deal with the outlier detection problem in high-dimensional data streams. SPOT is unique ...
To cope with concept drift, we paired a stable online learner with a reactive one. A stable learner predicts based on all of its experience, whereas a reactive learner predicts ba...