Sciweavers

KDD
2009
ACM
14 years 8 months ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
CORR
2006
Springer
13 years 7 months ago
Automatic annotation of multilingual text collections with a conceptual thesaurus
Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of...
Bruno Pouliquen, Ralf Steinberger, Camelia Ignat
ICDAR
2007
IEEE
14 years 1 month ago
Curvelets Based Queries for CBIR Application in Handwriting Collections
This paper presents a new use of the Curvelet transform as a multiscale method for indexing linear singularities and curved handwritten shapes in document images. As it belongs t...
Guillaume Joutel, Véronique Eglin, Sté...
ICDAR
2005
IEEE
14 years 1 month ago
Document Ranking by Layout Relevance
This paper describes the development of a new document ranking system based on layout similarity. The user has a need represented by a set of “wanted” documents, and the syste...
May Huang, Daniel DeMenthon, David S. Doermann, Ly...
CORR
2006
Springer
13 years 7 months ago
Navigating multilingual news collections using automatically extracted information
We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that...
Ralf Steinberger, Bruno Pouliquen, Camelia Ignat