Sciweavers

35 search results - page 3 / 7
» Document centered approach to text normalization
CIMCA
2006
IEEE
13 years 9 months ago
Identification of Document Language is Not yet a Completely Solved Problem
Existing Language Identification (LID) approaches do reach 100% precision, in most common situations, when dealing with documents written in just one language, and when those docu...
Joaquim Ferreira da Silva, Gabriel Pereira Lopes
TREC
2003
13 years 9 months ago
Approaches to Robust and Web Retrieval
We describe our participation in the TREC 2003 Robust and Web tracks. For the Robust track, we experimented with the impact of stemming and feedback on the worst scoring topics. ...
Jaap Kamps, Christof Monz, Maarten de Rijke, Bö...
AUSDM
2006
Springer
13 years 11 months ago
A Study of Local and Global Thresholding Techniques in Text Categorization
Feature Filtering is an approach that is widely used for dimensionality reduction in text categorization. In this approach feature scoring methods are used to evaluate features le...
Nayer M. Wanas, Dina A. Said, Nevin M. Darwish, Na...
DRR
2003
13 years 9 months ago
Correcting OCR text by association with historical datasets
The Medical Article Records System (MARS) developed by the Lister Hill National Center for Biomedical Communications uses scanning, OCR and automated recognition and reformatting ...
Susan E. Hauser, Jonathan Schlaifer, Tehseen F. Sa...
NLDB
2000
Springer
13 years 11 months ago
Natural Language Analysis for Semantic Document Modeling
To ease the retrieval of documents published on the Web, the documents should be classified in a way that users find helpful and meaningful. This paper presents an approach to sema...
Terje Brasethvik, Jon Atle Gulla