Sciweavers

Models and Algorithms for Duplicate Document Detection
WWW
2006
ACM
14 years 8 months ago
Logical structure based semantic relationship extraction from semi-structured documents
Addressed in this paper is the issue of semantic relationship extraction from semi-structured documents. Many research efforts have been made so far on the semantic information ex...
Kuo Zhang, Gang Wu, Juan-Zi Li
IS
2006
13 years 7 months ago
A methodology for clustering XML documents by structure
The processing and management of XML data are popular research issues. However, operations based on the structure of XML data have not received strong attention. These operations ...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...
ICDAR
2003
IEEE
14 years 19 days ago
Word Segmentation of Handwritten Dates in Historical Documents by Combining Semantic A-Priori-Knowledge with Local Features
The recognition of script in historical documents requires suitable techniques in order to identify single words. Segmentation of lines and words is a challenging task because lin...
Markus Feldbach, Klaus D. Tönnies
KDD
2009
ACM
14 years 2 months ago
On burstiness-aware search for document sequences
As the number and size of large timestamped collections (e.g. sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching su...
Theodoros Lappas, Benjamin Arai, Manolis Platakis,...
MM
2009
ACM
14 years 1 days ago
Near-duplicate video matching with transformation recognition
Nowadays, the issue of near-duplicate video matching has been extensively studied. However, transformation, which is one of the major causes of near-duplicates, has been little di...
Zhipeng Wu, Shuqiang Jiang, Qingming Huang