Removing biases in unsupervised learning of sequential patterns

15 years 6 months ago

Unsupervised sequence learning is important to many applications. A learner is presented with unlabeled sequential data, and must discover sequential patterns that characterize the data. Popular approaches to such learning include (and often combine) frequency-based approaches and statistical analysis. However, the quality of results is often far from satisfactory. Though most previous investigations seek to address method-specific limitations, we instead focus on general (method-neutral) limitations in current approaches. This paper takes two key steps towards addressing such general quality-reducing flaws. First, we carry out an in-depth empirical comparison and analysis of popular sequence learning methods in terms of the quality of information produced, for several synthetic and real-world datasets, under controlled settings of noise. We find that both frequency-based and statistics-based approaches (i) suffer from common statistical biases based on the length of the sequences co...

Yoav Horman, Gal A. Kaminka
