In this paper, we describe a new approach for mining concept associations from large text collections. The concepts are short sequences of words that occur frequently together acr...
In this paper, we will propose PC-Filter (PC stands for Partition Comparison), a robust data filter for approximately duplicate record detection in large databases. PC-Filter dis...
Ji Zhang, Tok Wang Ling, Robert M. Bruckner, Han L...
Background: The MEDLINE database contains over 12 million references to scientific literature, with about 3/4 of recent articles including an abstract of the publication. Retrieval of ent...
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming ...
Hendrik Blockeel, Luc De Raedt, Nico Jacobs, Bart ...
In this paper, we propose a framework, called XAR-Miner, for mining association rules (ARs) from XML documents efficiently and effectively. In XAR-Miner, raw XML data are first transformed to either...
Ji Zhang, Tok Wang Ling, Robert M. Bruckner, A. Mi...