Structured data, including sets, sequences, trees, and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and simila...
Xiaohong Wang, Aaron M. Smalter, Jun Huan, Gerald ...
Counting in general, and estimating the cardinality of (multi-)sets in particular, is highly desirable for a large variety of applications, representing a foundational block for ...
Nikos Ntarmos, Peter Triantafillou, Gerhard Weikum
The Vector Space Model (VSM) has been at the core of information retrieval for the past decades. VSM treats documents as vectors in a high-dimensional space. In such a vector spa...
Data availability, collection and storage have increased dramatically in recent years, raising new technological and algorithmic challenges for database design and data management...
A string similarity join finds similar pairs between two collections of strings. It is an essential operation in many applications, such as data integration and cleaning, and has ...