When aligning texts in very different languages such as Korean and English, structural features beyond word or phrase give useful intbrmation. In this paper, we present a method f...
Phrasal segmentation models define a mapping from the words of a sentence to sequences of translatable phrases. We discuss the estimation of these models from large quantities of ...
Computational models of meaning trained on naturally occurring text successfully model human performance on tasks involving simple similarity measures, but they characterize meani...
Marco Baroni, Brian Murphy, Eduard Barbu, Massimo ...
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple in...
- Research work related to applying text categorization methods to a monolingual corpus such as English text collections has been well established by several research teams in rece...