Sciweavers

DAS
2010
Springer
13 years 5 months ago
Information extraction by finding repeated structure
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
Evgeniy Bart, Prateek Sarkar
AUGHUMAN
2010
13 years 5 months ago
On-line document registering and retrieving system for AR annotation overlay
We propose a system that registers and retrieves text documents to annotate them on-line. The user registers a text document captured from a nearly top view and adds virtual annot...
Hideaki Uchiyama, Julien Pilet, Hideo Saito
AND
2010
13 years 5 months ago
Document: a useful level for facing noisy data
In this paper we will present a set of experiments using large digitalized collections of books to show that logical structures can be extracted with good quality when working at ...
Hervé Déjean, Jean-Luc Meunier
AND
2010
13 years 5 months ago
A platform for storing, visualizing, and interpreting collections of noisy documents
The goal of document image analysis is to produce interpretations that match those of a fluent and knowledgeable human when viewing the same input. Because computer vision technique...
Bart Lamiroy, Daniel P. Lopresti
SPIRE
2010
Springer
13 years 6 months ago
Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval
The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
SMC
2010
IEEE
13 years 6 months ago
Semantic enrichment of text representation with wikipedia for text classification
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
Hiroki Yamakawa, Jing Peng, Anna Feldman
SCIENTOMETRICS
2010
13 years 6 months ago
The 12th International conference on scientometrics and informetrics
This paper presents an approach for identifying similar documents that can be used to assist scientists in finding related work. The approach called Citation Proximity Analysis (C...
Jacqueline Leta, Birger Larsen, Ronald Rousseau, W...
PVLDB
2010
13 years 6 months ago
P2PDocTagger: Content management through automated P2P collaborative tagging
As the amount of user generated content grows, personal information management has become a challenging problem. Several information management approaches, such as desktop search,...
Hock Hee Ang, Vivekanand Gopalkrishnan, Wee Keong ...
PAMI
2010
13 years 6 months ago
A Variational Approach to Degraded Document Enhancement
The goal of this paper is to correct bleed-through in degraded documents using a variational approach. The variational model is adapted using an estimated background according t...
Reza Farrahi Moghaddam, Mohamed Cheriet
KES
2010
Springer
13 years 6 months ago
DOCODE-Lite: A Meta-Search Engine for Document Similarity Retrieval
The retrieval of similar documents from large-scale datasets has been one of the main concerns in knowledge management environments, such as plagiarism detection, news impact a...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...