Search Sciweavers | Sciweavers

ICDAR
2009
IEEE

158 views, Document Analysis

Document Content Extraction Using Automatically Discovered Features

15 years 4 months ago

Download www.cse.lehigh.edu

We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...

Sui-Yu Wang, Henry S. Baird, Chang An

KDD
2002
ACM

148 views, Data Mining

Discovering informative content blocks from Web documents

16 years 7 months ago

Download www.cs.ualberta.ca

In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...

Shian-Hua Lin, Jan-Ming Ho

ICDAR
2009
IEEE

168 views, Document Analysis

Scalable Feature Extraction from Noisy Documents

16 years 1 months ago

Download www.cvc.uab.es

We cope with the metadata recognition in layout-oriented documents. We address the problem as a classification task and propose a method for automatic extraction of relevant featu...

Loïc Lecerf, Boris Chidlovskii

MLDM
2005
Springer

162 views, Machine Learning

CorePhrase: Keyphrase Extraction for Document Clustering

16 years 3 days ago

Download pami.uwaterloo.ca

Abstract. The ability to discover the topic of a large set of text documents using relevant keyphrases is usually regarded as a very tedious task if done by hand. Automatic keyphra...

Khaled M. Hammouda, Diego N. Matute, Mohamed S. Ka...

SIGIR
2003
ACM

147 views, Information Technology

Text categorization by boosting automatically extracted concepts

15 years 12 months ago

Download www.cs.brown.edu

Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...

Lijuan Cai, Thomas Hofmann
