Sciweavers

32 search results - page 4 / 7
» Near-duplicate detection for web-forums
WWW
2008
ACM
14 years 8 months ago
Social and semantics analysis via non-negative matrix factorization
Social media such as Web forums often have dense interactions between users and content, for which network models are often appropriate for analysis. Joint non-negative matrix factorizat...
Zhi-Li Wu, Chi-Wa Cheng, Chun-hung Li
CIVR
2007
Springer
14 years 1 month ago
Scalable near identical image and shot detection
This paper proposes and compares two novel schemes for near duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using ...
Ondrej Chum, James Philbin, Michael Isard, Andrew ...
DIS
2007
Springer
14 years 1 month ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
KDD
2004
ACM
14 years 8 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
MM
2009
ACM
14 years 6 days ago
MyFinder: near-duplicate detection for large image collections
The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...
Xin Yang, Qiang Zhu, Kwang-Ting Cheng