In this paper, we propose a framework, called XAR-Miner, for mining ARs from XML documents efficiently. In XAR-Miner, raw data in the XML document are first preprocessed to transf...
MineSetTM , Silicon Graphics’ interactive system for data mining, integrates three powerful technologies: database access, analytical data mining, and data visualization. It sup...
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated la...
Victor S. Sheng, Foster J. Provost, Panagiotis G. ...
Apriori Stochastic Dependency Detection (ASDD) is an algorithm for fast induction of stochastic logic rules from a database of observations made by an agent situated in an environm...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...