Abstract: Traditionally, HMM-based approaches to online Kanji handwriting recognition have relied on a hand-made dictionary that maps characters to primitives such as strokes or substrokes. We present an unsupervised method for learning a stroke tagger from data, which we then use to generate such a dictionary automatically. In addition to removing the need for a hand-made dictionary, our approach can improve recognition accuracy by exploiting unlabeled data when the amount of labeled data is limited.

Keywords: kanji; handwriting recognition; HMM; clustering