Abstract: Large high-dimensional datasets are of growing importance in many fields, and it is important to be able to visualize them for understanding the results of data mining appro...
Jong Youl Choi, Seung-Hee Bae, Xiaohong Qiu, Geoff...
Tree edit distance is one of the most frequently used distance measures for comparing trees. When using the tree edit distance, we need to determine the cost of each operation, bu...
We define and solve the problem of "distribution classification", and, in general, "distribution mining". Given n distributions (i.e., clouds) of multi-dimensi...
Yasushi Sakurai, Rosalynn Chong, Lei Li, Christos ...
Background: Micro- and macroarray technologies help acquire thousands of gene expression patterns covering important biological processes during plant ontogeny. Particularly, fait...
Determining similarity is a fundamental task in querying multimedia databases in a content-based way. For this challenging task, there exist numerous similarity models which measu...