Sciweavers

ICDM
2009
IEEE

Hierarchical Probabilistic Segmentation of Discrete Events

14 years 6 months ago
Hierarchical Probabilistic Segmentation of Discrete Events
—Segmentation, the task of splitting a long sequence of discrete symbols into chunks, can provide important information about the nature of the sequence that is understandable to humans. Algorithms for segmenting mostly belong to the supervised learning family, where a labeled corpus is available to the algorithm in the learning phase. We are interested, however, in the unsupervised scenario, where the algorithm never sees examples of successful segmentation, but still needs to discover meaningful segments. In this paper we present an unsupervised learning algorithm for segmenting sequences of symbols or categorical events. Our algorithm, Hierarchical Multigram, hierarchically builds a lexicon of segments and computes a maximum likelihood segmentation given the current lexicon. Thus, our algorithm is most appropriate to hierarchical sequences, where smaller segments are grouped into larger segments. Our probabilistic approach also allows us to suggest conditional entropy as a measure...
Guy Shani, Christopher Meek, Asela Gunawardana
Added 23 May 2010
Updated 23 May 2010
Type Conference
Year 2009
Where ICDM
Authors Guy Shani, Christopher Meek, Asela Gunawardana
Comments (0)