Top-k approximate querying on string collections is an important data analysis tool for many applications, and it has been exhaustively studied. However, the scale of the problem ...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
String-to-string transduction is a central problem in computational linguistics and natural language processing. It occurs in tasks as diverse as name transliteration, spelling co...
The Monge-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token leve...
Sergio Jimenez, Claudia Becerra, Alexander F. Gelb...
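As an illustration of the idea summarized in the abstract above, here is a minimal sketch of the classic Monge-Elkan similarity as it is commonly described: each token of the first string is matched to its most similar token in the second string, and the best scores are averaged. The abstract is truncated, so this is not necessarily the variant the authors study. The function names `levenshtein`, `char_similarity`, and `monge_elkan`, the whitespace tokenization, and the normalization of edit distance to a [0, 1] similarity are all assumptions made for this example.

```python
from typing import Callable


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_similarity(a: str, b: str) -> float:
    """Assumed internal measure: edit distance normalized to [0, 1]."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def monge_elkan(s: str, t: str,
                inner: Callable[[str, str], float] = char_similarity) -> float:
    """Average, over tokens of s, of the best inner similarity against tokens of t."""
    tokens_s = s.split()  # assumption: whitespace tokenization
    tokens_t = t.split()
    if not tokens_s or not tokens_t:
        return 0.0
    best_scores = [max(inner(a, b) for b in tokens_t) for a in tokens_s]
    return sum(best_scores) / len(tokens_s)
```

In this formulation the score is insensitive to token order but is not symmetric: `monge_elkan("paul", "paul johnson")` is 1.0, while swapping the arguments gives a lower value because "johnson" has no close match.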
A modular system to recognize handwritten numerical strings is proposed. It uses a segmentation-based recognition approach and a Recognition and Verification strategy. The approach...
Luiz E. Soares de Oliveira, Robert Sabourin, Flá...