Previous attempts at identifying translational equivalents in comparable corpora have dealt with very large 'general language' corpora and words. We address this task in a sp...
The title of a document has two roles: to give a compact summary and to lead the reader to read the document. Conventional title generation focuses on finding key expressions from...
This paper describes the results of some experiments exploring statistical methods to infer syntactic categories from a raw corpus in an unsupervised fashion. It shares certain po...
Weak acceptance conditions for automata on infinite words or trees are defined in terms of the set of states that appear in the run. This is in contrast with the more usual strong c...
Jakub Neumann, Andrzej Szepietowski, Igor Walukiew...
This paper presents a method that performs two topic analysis tasks in an integrated way: segmentation and link detection. This method combines word repetition and the lex...