Search Sciweavers | Sciweavers

SIGIR
2003
ACM

98 views, Information Technology

An information-theoretic measure for document similarity

14 years 2 months ago

Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifica...

Javed A. Aslam, Meredith Frost


HICSS
2006
IEEE

133 views, Biometrics

Being Literate with Large Document Collections: Observational Studies and Cost Structure Tradeoffs

14 years 3 months ago

Download cobweb.ecn.purdue.edu

How do people work with large document collections? We studied the effects of different kinds of analysis tools on the behavior of people doing rapid large-volume data assessment,...

Daniel M. Russell, Malcolm Slaney, Yan Qu, Mave Ho...


ICDAR
2009
IEEE

158 views, Document Analysis

Document Content Extraction Using Automatically Discovered Features

13 years 6 months ago

Download www.cse.lehigh.edu

We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...

Sui-Yu Wang, Henry S. Baird, Chang An


ICPR
2008
IEEE

126 views, Computer Vision

A robust technique for text extraction in mixed-type binary documents

14 years 3 months ago

Download figment.cse.usf.edu

A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. The present work, in contrast to most until now, successfully faces the pro...

Charalambos Strouthopoulos, Athanasios Nikolaidis


INEX
2005
Springer

124 views, Information Technology

A Flexible Structured-Based Representation for XML Document Mining

14 years 2 months ago

Download hal.inria.fr

This paper reports on the INRIA group’s approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allo...

Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yve...

