

Measuring similarity between collection of values

In this paper, we propose a set of similarity metrics for manipulating collections of values occurring in XML documents. Following the data model presented in the TAX algebra, we treat an XML element as a labeled ordered rooted tree. Considering that XML nodes can be either atomic, i.e., they may contain single values such as short character strings, dates, etc., or complex, i.e., nested structures that contain other nodes, we propose two types of similarity metrics: MAVs, for atomic nodes, and MCVs, for complex nodes. In the first case, we suggest the use of several application-domain-dependent metrics. In the second case, we define metrics for complex values that are structure dependent and can be applied separately to tuples and to collections of values. We also present experiments showing the effectiveness of our method.
Categories and Subject Descriptors: H.4 [Information Systems Applications]: Miscellaneous
General Terms: Experimentation, Measurement
Keywords: Similarity functions, Vague que...
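The abstract does not give the actual MAV and MCV definitions, so the following is only an illustrative sketch of the atomic/complex split it describes, not the authors' metrics. It assumes a simplified data model (strings for atomic nodes, dicts for tuples, lists for collections) rather than the labeled ordered tree of the paper, and every function name and formula here (`atomic_sim`, `tuple_sim`, `collection_sim`, the averaging rules) is hypothetical.

```python
from difflib import SequenceMatcher


def atomic_sim(a: str, b: str) -> float:
    """Stand-in for an atomic-value metric: generic string similarity in [0, 1].

    The abstract suggests domain-dependent metrics here; a generic
    string ratio is used only as a placeholder assumption.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def node_sim(x, y) -> float:
    """Dispatch on node kind; nodes of different kinds score 0.0 (assumption)."""
    if isinstance(x, str) and isinstance(y, str):
        return atomic_sim(x, y)
    if isinstance(x, dict) and isinstance(y, dict):
        return tuple_sim(x, y)
    if isinstance(x, list) and isinstance(y, list):
        return collection_sim(x, y)
    return 0.0


def tuple_sim(x: dict, y: dict) -> float:
    """Hypothetical tuple metric: average child similarity over all labels.

    Children are paired by label; a label present on only one side
    contributes 0 to the average.
    """
    labels = set(x) | set(y)
    if not labels:
        return 1.0
    total = sum(node_sim(x[k], y[k]) for k in labels if k in x and k in y)
    return total / len(labels)


def collection_sim(xs: list, ys: list) -> float:
    """Hypothetical collection metric: best match per element of xs.

    Each element of xs is scored against its most similar element of ys,
    and the sum is normalized by the larger collection size. Element
    order is ignored.
    """
    if not xs and not ys:
        return 1.0
    if not xs or not ys:
        return 0.0
    best = sum(max(node_sim(x, y) for y in ys) for x in xs)
    return best / max(len(xs), len(ys))
```

For example, `node_sim({"author": ["Dorneles", "Heuser"]}, {"author": ["Heuser", "Dorneles"]})` compares two tuples whose `author` child is a collection; under the assumptions above, the order of the names does not affect the score.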
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where WIDM
Authors Carina F. Dorneles, Carlos A. Heuser, Andrei E. N. Lima, Altigran Soares da Silva, Edleno Silva de Moura