Sciweavers

142 search results - page 6 / 29
» Entropy-Based Authorship Search in Large Document Collection...
ICDAR
2007
IEEE
14 years 1 months ago
Curvelets Based Queries for CBIR Application in Handwriting Collections
This paper presents a new use of the Curvelet transform as a multiscale method for indexing linear singularities and curved handwritten shapes in documents images. As it belongs t...
Guillaume Joutel, Véronique Eglin, Sté...
APWEB
2006
Springer
13 years 11 months ago
The Case of the Duplicate Documents: Measurement, Search, and Science
Many of the documents in large text collections are duplicates and versions of each other. In recent research, we developed new methods for finding such duplicates; however, as the...
Justin Zobel, Yaniv Bernstein
ICDAR
2005
IEEE
14 years 1 months ago
A Segmentation-free Approach for Keyword Search in Historical Typewritten Documents
In this paper, we propose a novel segmentation-free approach for keyword search in historical typewritten documents combining image preprocessing, synthetic data creation, word sp...
Basilios Gatos, Thomas Konidaris, Kostas Ntzios, I...
CIKM
2010
Springer
13 years 4 months ago
Crawling the web for structured documents
Structured Information Retrieval is gaining a lot of interest in recent years, as this kind of information is becoming an invaluable asset for professional communities such as Sof...
Julián Urbano, Juan Loréns, Yorgos A...
ERCIMDL
2003
Springer
14 years 25 days ago
Search Engine-Crawler Symbiosis: Adapting to Community Interests
Web crawlers have been used for nearly a decade as a search engine component to create and update large collections of documents. Typically the crawler and the rest of the search e...
Gautam Pant, Shannon Bradshaw, Filippo Menczer