Sciweavers

259 search results - page 44 / 52
» Query-free news search
SIGIR
2008
ACM
13 years 9 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
COMPUTER
2000
13 years 9 months ago
The Challenges of Automatic Summarization
Summarization--the art of abstracting key content from one or more information sources--has become an integral part of everyday life. People keep abreast of world affairs by listening to ne...
Udo Hahn, Inderjeet Mani
SAC
2005
ACM
14 years 3 months ago
Automatic extraction of informative blocks from webpages
Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...
Sandip Debnath, Prasenjit Mitra, C. Lee Giles
ICPR
2002
IEEE
14 years 11 months ago
The Performance Analysis of a Chi-square Similarity Measure for Topic Related Clustering of Noisy Transcripts
The goal of the paper is to present a novel Chi-square similarity measure and assess its performance through comparison with well-known similarity measures such as Cosine, Dice, a...
Oktay Ibrahimov, Ishwar K. Sethi, Nevenka Dimitrov...
KDD
2009
ACM
14 years 10 months ago
Time series shapelets: a new primitive for data mining
Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm ...
Lexiang Ye, Eamonn J. Keogh