PAMI 2002

Imaged Document Text Retrieval Without OCR

Abstract: We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects, from which two image features, the Vertical Traverse Density (VTD) and the Horizontal Traverse Density (HTD), are extracted. An n-gram based document vector is constructed for each document from these features. Text similarity between documents is then measured by the dot product of their document vectors. Tests on seven corpora of imaged textual documents in English and Chinese, as well as on images from the UW1 database, confirm the validity of the proposed method.

Chew Lim Tan, Weihua Huang, Zhaohui Yu, Yi Xu
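
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: traverse density is taken here to be the number of background-to-ink transitions along each scan line, character objects are assumed to be already segmented into small binary bitmaps, each character is identified by an exact match on its (VTD, HTD) pair, and the document vector is a length-normalised count of character n-grams. The function names (`traverse_density`, `char_signature`, `document_vector`, `similarity`) and the toy glyphs are hypothetical.

```python
from collections import Counter
from math import sqrt


def traverse_density(bitmap, axis):
    """Count background-to-ink transitions along each scan line.

    axis="vertical":   one count per column, scanning top to bottom.
    axis="horizontal": one count per row, scanning left to right.
    (Assumed definition, used only for this illustration.)
    """
    lines = list(zip(*bitmap)) if axis == "vertical" else bitmap
    counts = []
    for line in lines:
        prev, crossings = 0, 0
        for pixel in line:
            if pixel and not prev:
                crossings += 1
            prev = pixel
        counts.append(crossings)
    return tuple(counts)


def char_signature(bitmap):
    """Describe one character object by its (VTD, HTD) pair."""
    return (traverse_density(bitmap, "vertical"),
            traverse_density(bitmap, "horizontal"))


def document_vector(char_bitmaps, n=3):
    """Build a unit-length sparse vector of character n-gram counts."""
    sigs = [char_signature(b) for b in char_bitmaps]
    grams = Counter(tuple(sigs[i:i + n]) for i in range(len(sigs) - n + 1))
    norm = sqrt(sum(c * c for c in grams.values()))
    return {g: c / norm for g, c in grams.items()} if norm else {}


def similarity(u, v):
    """Dot product of two sparse document vectors."""
    return sum(w * v.get(g, 0.0) for g, w in u.items())


# Toy binary glyphs (1 = ink, 0 = background); purely illustrative.
GLYPH_A = [[0, 1, 0],
           [1, 0, 1],
           [1, 1, 1],
           [1, 0, 1]]
GLYPH_L = [[1, 0],
           [1, 0],
           [1, 1]]
GLYPH_O = [[1, 1, 1],
           [1, 0, 1],
           [1, 1, 1]]

doc1 = document_vector([GLYPH_A, GLYPH_L, GLYPH_O, GLYPH_A, GLYPH_L, GLYPH_O])
doc2 = document_vector([GLYPH_A, GLYPH_L, GLYPH_O, GLYPH_A, GLYPH_L, GLYPH_O])
doc3 = document_vector([GLYPH_O, GLYPH_O, GLYPH_O, GLYPH_O])

same = similarity(doc1, doc2)       # identical character sequences
different = similarity(doc1, doc3)  # no trigram in common
```

Exact matching on the signature keeps the sketch short, but it would be brittle on scanned pages; a practical system would presumably need size normalisation and some tolerance to noise when comparing features.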

Post Info

Added	23 Dec 2010
Updated	23 Dec 2010
Type	Journal
Year	2002
Where	PAMI
Authors	Chew Lim Tan, Weihua Huang, Zhaohui Yu, Yi Xu
