Sciweavers

315 search results - page 22 / 63
» Text classification from positive and unlabeled documents
ICDAR
2005
IEEE
14 years 2 months ago
Text/Graphic labelling of Ancient Printed Documents
This paper presents a text/graphic labelling for ancient printed documents. Our approach is based on the extraction and the quantification of the various orientations that are pre...
Nicholas Journet, Véronique Eglin, Jean-Yve...
SIGIR
2000
ACM
14 years 29 days ago
Document centered approach to text normalization
In this paper we present an approach to tackle three important problems of text normalization: sentence boundary disambiguation, disambiguation of capitalized words when they are ...
Andrei Mikheev
IPM
2006
13 years 8 months ago
Exploiting structural information for semi-structured document categorization
This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for ...
Andrej Bratko, Bogdan Filipic
KDD
2007
ACM
14 years 9 months ago
Raising the baseline for high-precision text classifiers
Many important application areas of text classifiers demand high precision and it is common to compare prospective solutions to the performance of Naive Bayes. This baseline is us...
Aleksander Kolcz, Wen-tau Yih
ICDAR
2011
IEEE
12 years 8 months ago
Identification of Indic Scripts on Torn-Documents
Questioned Document Examination processes often encompass analysis of torn documents. To aid a forensic expert, automatic classification of content type in torn documents might ...
Sukalpa Chanda, Katrin Franke, Umapada Pal