Search Sciweavers | Sciweavers

808 search results - page 67 / 162

» Keyword-based document clustering

ICDAR
2009
IEEE

158 views, Document Analysis

Document Content Extraction Using Automatically Discovered Features

15 years 5 months ago

Download www.cse.lehigh.edu

We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...

Sui-Yu Wang, Henry S. Baird, Chang An

ICPR
2008
IEEE

126 views, Computer Vision

A robust technique for text extraction in mixed-type binary documents

16 years 1 month ago

Download figment.cse.usf.edu

A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. The present work, in contrast to most until now, successfully faces the pro...

Charalambos Strouthopoulos, Athanasios Nikolaidis

INEX
2005
Springer

124 views, Information Technology

A Flexible Structured-Based Representation for XML Document Mining

16 years 22 days ago

Download hal.inria.fr

This paper reports on the INRIA group’s approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allo...

Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yve...

EMNLP
2010

122 views, Natural Language Processing

NLP on Spoken Documents Without ASR

15 years 5 months ago

Download www.cs.jhu.edu

There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...

Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...

VLDB
2007
ACM

93 views, Database

Measuring the Structural Similarity of Semistructured Documents Using Entropy

16 years 7 months ago

Download www.vldb.org

We propose a technique for measuring the structural similarity of semistructured documents based on entropy. After extracting the structural information from two documents we use ...

Sven Helmer
