We propose an unsupervised method for detecting spam documents in Web page data, based on equivalence relations on strings. We propose three measures for quantifying the alienness (...
This study explores the benefits of integrating knowledge representations in prior art patent retrieval. Key to the introduced approach is the utilization of human judgment availa...
Erik Graf, Ingo Frommholz, Mounia Lalmas, Keith va...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
This paper proposes a novel framework for music content indexing and retrieval. The music structure information, i.e., timing, harmony and music region content, is represented by ...
Namunu Chinthaka Maddage, Haizhou Li, Mohan S. Kan...
The query equivalence problem has been studied extensively for set semantics and, more recently, for bag-set semantics. However, SQL queries often combine set and bag-set semantic...