The problem of dividing a sequence of values into segments occurs in database systems, information retrieval, and knowledge management. The challenge is to select a finite number of segment boundaries so as to optimize an objective error function defined over those segments. Although this optimization problem can be solved in polynomial time, the algorithm that achieves the minimum error does not scale well and is therefore impractical for applications with massive data sets. Considerable research has produced numerous approximation and heuristic algorithms, yet none of these approaches has resolved the quality-efficiency tradeoff satisfactorily. In (Halim, Karras, and Yap 2009), we obtain near-linear-time algorithms that achieve both the desired scalability and near-optimal quality, thus dominating earlier approaches. In this paper, we show how two ideas from artificial intelligence, an efficient local search and recombination of multiple solutions reminiscent ...
Felix Halim, Panagiotis Karras, Roland H. C. Yap
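
To make the polynomial-time baseline mentioned in the abstract concrete, the following is a minimal sketch of a textbook dynamic program for optimal segmentation. It is an illustration under stated assumptions, not necessarily the formulation used by the authors: the error function is assumed to be the sum of squared deviations from each segment's mean, and the names `optimal_segmentation`, `values`, and `num_segments` are hypothetical. Its running time grows quadratically with the sequence length (times the number of segments), which is the kind of cost that becomes impractical on massive data sets.

```python
from itertools import accumulate


def optimal_segmentation(values, num_segments):
    """Split values into num_segments contiguous segments with minimum error.

    Assumption: the error of a segment is its sum of squared deviations
    from the segment mean. Returns (min_error, segment_start_indices).
    """
    n = len(values)
    if not 1 <= num_segments <= n:
        raise ValueError("num_segments must be between 1 and len(values)")

    # Prefix sums let us compute any segment's squared error in O(1).
    prefix = [0.0] + list(accumulate(float(v) for v in values))
    prefix_sq = [0.0] + list(accumulate(float(v) * float(v) for v in values))

    def sse(i, j):
        """Squared error of values[i:j] around its mean (requires i < j)."""
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimum error covering values[:j] with exactly k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    # cut[k][j]: start index of the last segment in that optimal solution.
    cut = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0

    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + sse(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    cut[k][j] = i

    # Walk the cut table backwards to recover the segment start indices.
    boundaries = []
    j = n
    for k in range(num_segments, 0, -1):
        j = cut[k][j]
        boundaries.append(j)
    boundaries.reverse()
    return best[num_segments][n], boundaries
```

For example, `optimal_segmentation([1, 1, 1, 5, 5, 5, 9, 9], 3)` places the segment starts at indices 0, 3, and 6, where each segment is constant and the total error is zero. The triple loop is what the faster approaches discussed in the abstract aim to avoid.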