document | Sciweavers

AND
2009

184 views | Machine Learning

Accessing the content of Greek historical documents

In this paper, we propose an alternative method for accessing the content of Greek historical documents printed during the 17th and 18th centuries by searching words directly in d...

Anastasios L. Kesidis, Eleni Galiotou, Basilios Ga...

AND
2009

117 views | Machine Learning

Tools for monitoring, visualizing, and refining collections of noisy documents

Download www.ecse.rpi.edu

Developing better systems for document image analysis requires understanding errors, their sources, and their effects. The interactions between various processing steps are comple...

Daniel P. Lopresti, George Nagy

AND
2009

137 views | Machine Learning

A comprehensive evaluation methodology for noisy historical document recognition techniques

Download users.iit.demokritos.gr

In this paper, we propose a new comprehensive methodology in order to evaluate the performance of noisy historical document recognition techniques. We aim to evaluate not only the...

Nikolaos Stamatopoulos, Georgios Louloudis, Basili...

AINA
2009
IEEE

118 views | Computer Networks

Document-Oriented Pruning of the Inverted Index in Information Retrieval Systems

Download www.cs.ucl.ac.uk

Searching very large collections can be costly in both computation and storage. To reduce this cost, recent research has focused on reducing the size (pruning) of the inverted ind...

Lei Zheng, Ingemar J. Cox

ACL
2009

91 views | Computational Linguistics

A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections

Download ilps.science.uva.nl

User generated content is characterized by short, noisy documents, with many spelling errors and unexpected language usage. To bridge the vocabulary gap between the user's in...

Wouter Weerkamp, Krisztian Balog, Maarten de Rijke

SOFSEM
2010
Springer

184 views | Theoretical Computer Science

Approximate Structural Consistency

Download www.lri.fr

Abstract. We consider documents as words and trees on some alphabet and study how to compare them with some regular schemas on an alphabet . Given an input document I, we decide ...

Michel de Rougemont, Adrien Vieilleribière

KDD
2010
ACM

326 views | Data Mining

Document clustering via Dirichlet process mixture model with feature selection

Download math.nankai.edu.cn

One essential issue of document clustering is to estimate the appropriate number of clusters for a document collection to which documents should be partitioned. In this paper, we ...

Guan Yu, Ruizhang Huang, Zhaojun Wang

IRFC
2010
Springer

182 views | Information Technology

An Information Retrieval Model Based on Discrete Fourier Transform

Download www.lix.polytechnique.fr

Abstract. Information Retrieval (IR) systems combine a variety of techniques stemming from logical, vector-space and probabilistic models. This variety of combinations has produced...

Alberto Costa, Massimo Melucci

ICPR
2010
IEEE

189 views | Computer Vision

Learning Image Anchor Templates for Document Classification and Data Extraction

Download www2.parc.com

Image anchor templates are used in document image analysis for document classification, data localization, and other tasks. Current tools allow human operators to mark out small s...

Prateek Sarkar

ICML
2010
IEEE

207 views | Machine Learning

Learning optimally diverse rankings over large document collections

Download www.icml2010.org

Most learning to rank research has assumed that the utility of different documents is independent, which results in learned ranking functions that return redundant results. The fe...

Aleksandrs Slivkins, Filip Radlinski, Sreenivas Go...
