Annotation of digitized pages from historical document collections is essential to research on the automatic extraction of text blocks and lines and on handwriting recognition. We hav...
We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our m...
This paper presents a method for acquiring synonyms from monolingual comparable text (MCT). MCT denotes a set of monolingual texts whose contents are similar and can be obtained au...
Previous work on information extraction from tables makes use of prior knowledge such as a cognition model of tables or lexical knowledge bases for specific domains. However, we ...
We developed a nonparametric Bayesian model for automatic discovery of semantic relationships between words taken from a corpus. It is aimed at discovering seman...