Sciweavers

Search results for "Models and Algorithms for Duplicate Document Detection"
PAMI
1998
INFORMys: A Flexible Invoice-Like Form-Reader System
In this paper, we describe a flexible form-reader system capable of extracting textual information from accounting documents, like invoices and bills of service companies. In th...
Francesca Cesarini, Marco Gori, Simone Marinai, Gi...
CIVR
2007
Springer
Image Analysis
Near-duplicate keyframe retrieval with visual keywords and semantic context
Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by explo...
Xiao Wu, Wanlei Zhao, Chong-Wah Ngo
KDD
2007
ACM
Data Mining
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
AIRWEB
2005
Springer
Blocking Blog Spam with Language Model Disagreement
We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In co...
Gilad Mishne, David Carmel, Ronny Lempel
MMM
2009
Springer
Multimedia
Large Scale Concept Detection in Video Using a Region Thesaurus
This paper presents an approach on high-level feature detection within video documents, using a Region Thesaurus. A video shot is represented by a single keyframe and MPEG-7 featur...
Evaggelos Spyrou, Giorgos Tolias, Yannis S. Avrith...