ALGORITHMICA
2010

Homogeneous String Segmentation using Trees and Weighted Independent Sets

We divide a string into k segments, each consisting of a single kind of symbol up to exceptions, so as to minimize the total number of exceptions. The motivation comes from machine learning and data mining. For binary strings we develop a linear-time algorithm for any k. The key to its efficiency is a special-purpose data structure, called a W-tree, which reflects relations between the repetition lengths of symbols. For non-binary strings we give a nontrivial dynamic programming algorithm. Our problem is equivalent to finding weighted independent sets with certain size constraints, either in paths (binary case) or in special interval graphs (general case). We also show that this problem is FPT (fixed-parameter tractable) in bounded-degree graphs.
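
To make the objective concrete, the following is a minimal Python sketch of a straightforward dynamic program for the segmentation problem. It is not the linear-time W-tree algorithm from the abstract, and it is not claimed to be the paper's dynamic program for non-binary strings; it is a plain baseline running in time proportional to n * k * (alphabet size). It rests on assumptions that the abstract does not state: the string is cut into exactly k non-empty contiguous segments, each segment is assigned one symbol, an exception is a position whose symbol differs from its segment's assigned symbol, and adjacent segments may share the same assigned symbol. The function name `min_exceptions` is hypothetical.

```python
def min_exceptions(s: str, k: int) -> int:
    """Fewest exceptions when s is cut into exactly k non-empty segments.

    Assumption (not from the source): an exception is a position whose
    symbol differs from the single symbol assigned to its segment.
    """
    if not 1 <= k <= len(s):
        raise ValueError("need 1 <= k <= len(s)")

    alphabet = sorted(set(s))
    INF = float("inf")

    # dp[j][c]: fewest exceptions for the prefix read so far, cut into
    # j segments, where the last segment is assigned alphabet[c].
    dp = [[INF] * len(alphabet) for _ in range(k + 1)]

    # best[j]: minimum of dp[j][c] over all c. The empty prefix with
    # zero segments costs nothing.
    best = [INF] * (k + 1)
    best[0] = 0

    for ch in s:
        new_dp = [[INF] * len(alphabet) for _ in range(k + 1)]
        for j in range(1, k + 1):
            for c, label in enumerate(alphabet):
                cost = 0 if ch == label else 1
                extend = dp[j][c]      # ch joins the current segment
                start = best[j - 1]    # ch opens a new segment
                new_dp[j][c] = min(extend, start) + cost
        dp = new_dp
        best = [min(row) for row in dp]

    return best[k]
```

For example, under these assumptions `min_exceptions("0001000", 1)` counts the single `1` as the only exception, while allowing three segments lets the `1` form its own segment and brings the count to zero.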
Peter Damaschke
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2010
Where ALGORITHMICA
Authors Peter Damaschke