Sciweavers

» Computing with words for text processing: An approach to the...
TAL
2004
Springer
14 years 2 months ago
One Size Fits All? A Simple Technique to Perform Several NLP Tasks
Word fragments or n-grams have been widely used to perform different Natural Language Processing tasks such as information retrieval [1] [2], document categorization [3], automatic...
Daniel Gayo-Avello, Darío Álvarez Gu...
ICASSP
2011
IEEE
13 years 16 days ago
Toward text message normalization: Modeling abbreviation generation
This paper describes a text normalization system for deletion-based abbreviations in informal text. We propose using statistical classifiers to learn the probability of deleting ...
Deana Pennell, Yang Liu
SIGIR
2003
ACM
14 years 2 months ago
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
Xiang Ji, Hongyuan Zha
KDD
2008
ACM
14 years 9 months ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
ICDAR
2007
IEEE
14 years 20 days ago
Identification of Latin-Based Languages through Character Stroke Categorization
This paper presents a language identification technique that detects Latin-based languages of imaged documents without OCR. The proposed technique detects languages through the wo...
S. J. Lu, L. Li, Chew Lim Tan