Search Sciweavers | Sciweavers

» Marking Text Documents

PAKDD
2009
ACM

127 views, Data Mining

Clustering Documents Using a Wikipedia-Based Concept Representation

15 years 9 months ago

Download www.cs.waikato.ac.nz

Abstract. This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...

Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...

ICDAR
2007
IEEE

163 views, Document Analysis

Content-level Annotation of Large Collection of Printed Document Images

15 years 9 months ago

Download cvit.iiit.ac.in

A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, ...

Anand Kumar 0002, C. V. Jawahar

RIDE
2002
IEEE

147 views, Document Analysis

Enhancive Index for Structured Document

15 years 8 months ago

Download www.cis.uni-muenchen.de

Structured documents, especially the XML documents, are made up of a few logical components, such as title, sections, subsections and paragraphs. The components in each structured...

Xiaoling Wang, Ji-Rong Wen, Yisheng Dong, Wenyin L...

FLAIRS
2006

134 views, Artificial Intelligence

Corpus Based Unsupervised Labeling of Documents

15 years 4 months ago

Download www.aaai.org

Text categorization involves mapping of documents to a fixed set of labels. A similar but equally important problem is that of assigning labels to large corpora. With a deluge of ...

Delip Rao, Deepak P, Deepak Khemani

KDD
2004
ACM

160 views, Data Mining

Boosting for Text Classification with Semantic Features

16 years 3 months ago

Download www.aifb.uni-karlsruhe.de

Abstract. Current text classification systems typically use term stems for representing document content. Semantic Web technologies allow the usage of features on a higher semantic...

Stephan Bloehdorn, Andreas Hotho
