Sciweavers

48 search results - page 5 / 10
» Collection statistics for fast duplicate document detection
SIGIR
2008
ACM
13 years 7 months ago
Local text reuse detection
Text reuse occurs in many different types of documents and for many different reasons. One form of reuse, duplicate or near-duplicate documents, has been a focus of researchers be...
Jangwon Seo, W. Bruce Croft
ICIW
2009
IEEE
13 years 5 months ago
Detecting Ontology Mappings via Descriptive Statistical Methods
Instance-based ontology mapping comprises a collection of theoretical approaches and applications for identifying the implicit semantic similarities between two ontologies on the ...
Konstantin Todorov
IKE
2004
13 years 9 months ago
Analyzing Large Collections of Email
One of the first applications of the Internet was electronic mail (e-mail). Along with the evolution of the Internet, e-mail has evolved into a powerful and popular technolo...
Daniel A. Keim, Christian Panse, Jörn Schneid...
TCSV
2002
13 years 7 months ago
Document image segmentation using wavelet scale-space features
In this paper, an efficient and computationally fast method for segmenting text and graphics part of document images based on textural cues is presented. We assume that the graphic...
Mausumi Acharyya, Malay K. Kundu
KDD
2009
ACM
14 years 2 months ago
On burstiness-aware search for document sequences
As the number and size of large timestamped collections (e.g. sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching su...
Theodoros Lappas, Benjamin Arai, Manolis Platakis,...