Similarity search usually encounters a serious problem in high-dimensional spaces, known as the “curse of dimensionality”. In order to improve retrieval efficiency, p...
Discovering rare categories and classifying new instances of them is an important data mining issue in many fields, but fully supervised learning of a rare class classifier is pr...
Enterprises often need to assess and manage the risk arising from uncertainty in their data. Such uncertainty is typically modeled as a probability distribution over the uncertain...
Peter J. Haas, Christopher M. Jermaine, Subi Arumu...
XML provides a natural mechanism for representing semistructured and unstructured data. It becomes the basis for encoding a large variety of information, for example, the...
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to do this is to build specialized applicatio...