This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (co...
We have designed, implemented and evaluated an end-to-end system spellchecking and autocorrection system that does not require any manually annotated training data. The World Wide...
Casey Whitelaw, Ben Hutchinson, Grace Chung, Ged E...
In this paper, we propose a comparative study between the affixal approach and the analytical approach for off-line Arabic decomposable word recognition. The analytical approach i...
This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre-segmented into a sequence of word atoms 1 using a maximum...
Word lattice decoding has proven useful in spoken language translation; we argue that it provides a compelling model for translation of text genres, as well. We show that prior wo...