In this paper we tackle sentence boundary disambiguation through a part-of-speech (POS) tagging framework. We describe necessary changes in text tokenization and the implementatio...
In this paper we present an approach to tackle three important problems of text normalization: sentence boundary disambiguation, disambiguation of capitalized words when they are ...
The units processed by tagging procedures, both automatic and manual, are sentences as they occur in the corpus texts, but the human annotators are instructed to assign ...
Word sense disambiguation for unrestricted text is one of the most difficult tasks in the field of computational linguistics. The crux of the problem is to discover a model that ...