A major problem that arises from integrating different databases is the existence of duplicates. Data cleaning is the process of identifying two or more records within the datab...
A data synopsis is a lossy compressed representation of data stored in databases that helps the query optimizer speed up the query process, e.g. time to retrieve the...
The maximum cardinality of a frequent set as well as the minimum cardinality of an infrequent set are important characteristic numbers in frequent (item) set mining. Gunopulos et a...
Let G = (V, E) be a connected multigraph, whose edges are associated with labels specified by an integer-valued function L : E → N. In addition, each label ℓ ∈ N has a non-...
The stretch factor of a Euclidean graph is the maximum ratio of the distance in the graph between any two points and their Euclidean distance. Given a set S of n points in R^d, we ...