Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
Summarization--the art of abstracting key content from one or more information sources--has become an integral part of everyday life. People keep abreast of world affairs by listening to ne...
Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...
The goal of the paper is to present a novel Chi-square similarity measure and assess its performance through comparison with well-known similarity measures such as Cosine, Dice, a...
Oktay Ibrahimov, Ishwar K. Sethi, Nevenka Dimitrov...
Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm ...