Sciweavers

Marking Text Documents
SIGIR
2000
ACM
Document centered approach to text normalization
In this paper we present an approach to tackle three important problems of text normalization: sentence boundary disambiguation, disambiguation of capitalized words when they are ...
Andrei Mikheev
ICDAR
2005
IEEE
Text Recognition of Low-resolution Document Images
Cheap and versatile cameras make it possible to easily and quickly capture a wide variety of documents. However, low resolution cameras present a challenge to OCR because it is vi...
Charles E. Jacobs, Patrice Y. Simard, Paul A. Viol...
ICDAR
2003
IEEE
Correcting Document Image Warping Based on Regression of Curved Text Lines
Image warping is a common problem when one scans or photocopies a document page from a thick bound volume, resulting in shading and curved text lines in the spine area of the boun...
Zheng Zhang 0003, Chew Lim Tan
ICDE
2007
IEEE
Document Representation and Dimension Reduction for Text Clustering
Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...
M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...
ICDM
2009
IEEE
TagLearner: A P2P Classifier Learning System from Collaboratively Tagged Text Documents
The amount of text data on the Internet is growing at a very fast rate. Online text repositories for news agencies, digital libraries and other organizations currently store gigaan...
Haimonti Dutta, Xianshu Zhu, Tushar Mahule, Hillol...