Sciweavers

257 search results - page 11 / 52
» Text extraction from graphical document images using sparse ...
SMC
2010
IEEE
13 years 6 months ago
Semantic enrichment of text representation with wikipedia for text classification
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
Hiroki Yamakawa, Jing Peng, Anna Feldman
PAMI
2002
13 years 7 months ago
Imaged Document Text Retrieval Without OCR
We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
Chew Lim Tan, Weihua Huang, Zhaohui Yu, Yi Xu
ICML
2007
IEEE
14 years 8 months ago
Self-taught learning: transfer learning from unlabeled data
We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabele...
Rajat Raina, Alexis Battle, Honglak Lee, Benjamin ...
CG
2007
Springer
13 years 7 months ago
Visual text mining using association rules
In many situations, individuals or groups of individuals are faced with the need to examine sets of documents to achieve understanding of their structure and to locate relevant in...
Alneu de Andrade Lopes, Roberto Pinho, Fernando Vi...
DIAL
2004
IEEE
13 years 11 months ago
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...