The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
The massive amount of near-duplicate and duplicate web videos has presented both challenge and opportunity to multimedia computing. On one hand, browsing videos on Internet become...
We study the average number of transitions in Glushkov automata built from random regular expressions. This statistic highly depends on the probabilistic distribution set on the e...
We present several methods for mining knowledge from the query logs of the MSN search engine. Using the query logs, we build a time series for each query word or phrase (e.g., `Th...
Michail Vlachos, Christopher Meek, Zografoula Vage...
With the advent of the Web and the efforts towards a Semantic Web the nature of knowledge engineering has changed drastically. In this position paper we propose four principles fo...