Sciweavers

ICDAR
2009
IEEE
14 years 2 months ago
The GERMANA Database
A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. G...
Daniel Pérez, Lionel Tarazón, Nicol...
SIGIR
2009
ACM
14 years 1 month ago
Named entity recognition in query
This paper addresses the problem of Named Entity Recognition in Query (NERQ), which involves detection of the named entity in a given query and classification of the named entity...
Jiafeng Guo, Gu Xu, Xueqi Cheng, Hang Li
ICDAR
2003
IEEE
14 years 20 days ago
A Segmentation Method for Bibliographic References by Contextual Tagging of Fields
In this paper, a method based on part-of-speech tagging (PoS) is used for bibliographic reference structure. This method operates on a roughly structured ASCII file, produced by O...
Dominique Besagni, Abdel Belaïd, Nelly Benet
DGO
2006
13 years 8 months ago
Next steps in near-duplicate detection for eRulemaking
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Hui Yang, Jamie Callan, Stuart W. Shulman
SIGIR
2008
ACM
13 years 7 months ago
Knowledge transformation from word space to document space
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
Tao Li, Chris H. Q. Ding, Yi Zhang 0005, Bo Shao