Given the continuous growth of databases and the abundance of diverse files in modern IT environments, there is a pressing need to integrate keyword search on heterogeneous inform...
We introduce a new dissimilarity function for ranked lists, the expected weighted Hoeffding distance, that has several advantages over current dissimilarity measures for ranked s...
Similarity analysis of source code is helpful during development to provide, for instance, better support for code reuse. Consider a development environment that analyzes code whi...
Tobias Sager, Abraham Bernstein, Martin Pinzger, C...
Text streams are becoming more and more ubiquitous, in the forms of news feeds, weblog archives and so on, which result in a large volume of data. An effective way to explore the...
Xiang Wang 0002, Kai Zhang, Xiaoming Jin, Dou Shen
In many applications, we find large video collections from different genres where the user is often only interested in one or two specific video genres. So, when users are queryin...