Data mining applications analyze large collections of set data and high-dimensional categorical data. Search on these data types is not restricted to the classic problems of minin...
WHIRL is an extension of relational databases that can perform "soft joins" based on the similarity of textual identifiers; these soft joins extend the traditional operation of j...
In order to find all occurrences of a tree/twig pattern in an XML database, a number of holistic twig join algorithms have been proposed. However, most of these algorithms focus o...
Label stream partition is a useful technique to reduce the input I/O cost of holistic twig join by pruning useless streams beforehand. The Prefix Path Stream (PPS) partition scheme...
Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...