Sciweavers

466 search results - page 38 / 94
» Scalable Feature Extraction from Noisy Documents
CSB
2004
IEEE
14 years 1 month ago
AZuRE, a Scalable System for Automated Term Disambiguation of Gene and Protein Names
Researchers, hindered by a lack of standard gene and protein-naming conventions, endure long, sometimes fruitless, literature searches. A system is described which is able to auto...
Raf M. Podowski, John G. Cleary, Nicholas T. Gonch...
WEBI
2005
Springer
14 years 3 months ago
A Semi-Supervised Document Clustering Algorithm Based on EM
Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge on the cat...
Leonardo Rigutini, Marco Maggini
SIGIR
2011
ACM
13 years 20 days ago
No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity
This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this pro...
Ferhan Ture, Tamer Elsayed, Jimmy J. Lin
ICDAR
2003
IEEE
14 years 3 months ago
Writer Identification using Innovative Binarised Features of Handwritten Numerals
The objective of this paper is to present a number of features that can be extracted from handwritten digits and used for author verification or identification of a person’s han...
Graham Leedham, Sumit Chachra
ICDAR
2009
IEEE
13 years 7 months ago
Document Image Binarisation Using Markov Field Model
This paper presents a new approach for the binarization of seriously degraded manuscripts. We introduce a new technique based on a Markov Random Field (MRF) model of the document. ...
Thibault Lelore, Frédéric Bouchara