Active data clustering is a novel technique for clustering of proximity data which utilizes principles from sequential experiment design in order to interleave data generation and...
Events and stories can be characterized by a set of descriptive, collocated keywords. Intuitively, documents describing the same event will contain similar sets of keywords, and t...
Motivation: In cluster analysis, the validity of specific solutions, algorithms, and procedures present significant challenges because there is no null hypothesis to test and no &...
Nikhil R. Garge, Grier P. Page, Alan P. Sprague, B...
This paper describes a simple clustering approach to person name disambiguation of retrieved documents. The methods are based on standard IR concepts and do not require any task-s...
This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...