» Models and Algorithms for Duplicate Document Detection

DIS
2007
Springer

Theoretical Computer Science

Unsupervised Spam Detection Based on String Alienness Measures

Download www.i.kyushu-u.ac.jp

We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...

Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...

SIGIR
2006
ACM

Information Technology

Feature diversity in cluster ensembles for robust document clustering

Download serpens.salleurl.edu

The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one ...

Xavier Sevillano, Germán Cobo, Francesc Al...

ICDM
2007
IEEE

Data Mining

Improving Knowledge Discovery in Document Collections through Combining Text Retrieval and Link Analysis Techniques

Download www.cedar.buffalo.edu

In this paper, we present Concept Chain Queries (CCQ), a special case of text mining in document collections focusing on detecting links between two topics across text documents. ...

Wei Jin, Rohini K. Srihari, Hung Hay Ho, Xin Wu

ICAPR
2001
Springer

Pattern Recognition

Character Extraction from Interfering Background - Analysis of Double-Sided Handwritten Archival Documents

Download www.comp.nus.edu.sg

The seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage poses a serious problem to human readers or OCR systems. This pape...

Chew Lim Tan, Ruini Cao, Qian Wang, Peiyi Shen

KDD
2007
ACM

Data Mining

Detecting research topics via the correlation between graphs and texts

Download www.cs.cornell.edu

In this paper we address the problem of detecting topics in large-scale linked document collections. Recently, topic detection has become a very active area of research due to its...

Yookyung Jo, Carl Lagoze, C. Lee Giles
