In this paper, we will propose PC-Filter (PC stands for Partition Comparison), a robust data filter for approximately duplicate record detection in large databases. PC-Filter dis...
Ji Zhang, Tok Wang Ling, Robert M. Bruckner, Han L...
Block-wise access to data is a central theme in the design of efficient external memory (EM) algorithms. A second important issue, when more than one disk is present, is fully par...
Frank K. H. A. Dehne, David A. Hutchinson, Anil Ma...
Background: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular met...
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
This paper describes the behavior observed in a class of cellular automata that we have defined as "dissipative", i.e., cellular automata for which the external environm...