Model-based declarative queries are becoming an attractive paradigm for interacting with many data stream applications. This has led to the development of techniques to accurately answer the queries using distributional models rather than raw values. The quintessential problem with this is that of detecting when there is a change in the input stream, which makes models stale and inaccurate. We adopt the sound statistical method of sequential hypothesis testing to study the problem of change detection on streams, without independence assumption. It yields algorithms that are fast, space-efficient, and are oblivious to data's underlying distributions. Our thorough study on synthetic and real data sets demonstrates the effectiveness of our methods to not only determine the existence of a change, but also the point where the change is initiated, relative to the ground truth we obtain. Our methods work seamlessly without window limitations inherent in prior work, thus have clearly sho...
S. Muthukrishnan, Eric van den Berg, Yihua Wu