We present a stochastic finite-state model for segmenting Chinese text into dictionary entries and productively derived words, and providing pronunciations for these words; the me...
Richard Sproat, Chilin Shih, William Gale, Nancy C...
Ambiguous person names are a problem in many forms of written text, including text found on the Web. In this paper we explore the use of unsupervised clustering techniques...
Identifying the occurrences of proper names in text and the entities they refer to can be a difficult task because of the many-to-many mapping between names and their referents. We...
In this paper, a novel algorithm is presented for writer identification from handwriting. Principal Component Analysis is applied to the gray-scale handwriting images to find a s...