All information exchange on the Internet, whether through full text, controlled vocabularies, ontologies, or other mechanisms, ultimately requires that an information provi...
Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained in legacy sources and text collections into a data ...
We present a tree data structure for fast nearest neighbor operations in general n-point metric spaces (where the data set consists of n points). The data structure requir...
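To make the operation in the abstract above concrete, here is a minimal sketch of a nearest neighbor query in a general metric space. It is a brute-force linear scan, not the tree structure the abstract describes; it only shows the query that such a tree is meant to speed up. The names `nearest_neighbor` and `manhattan` and the sample points are illustrative assumptions, not taken from the source.

```python
from typing import Callable, Sequence, Tuple, TypeVar

P = TypeVar("P")


def nearest_neighbor(
    query: P,
    points: Sequence[P],
    dist: Callable[[P, P], float],
) -> Tuple[P, float]:
    """Return the point closest to `query` and its distance.

    Brute-force baseline: one distance evaluation per point, so O(n)
    per query. Only a distance function is needed, no coordinates.
    """
    if not points:
        raise ValueError("points must not be empty")
    best = min(points, key=lambda p: dist(query, p))
    return best, dist(query, best)


def manhattan(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    # Hypothetical metric chosen for the example; any function that
    # satisfies the metric axioms could be passed instead.
    return sum(abs(x - y) for x, y in zip(a, b))


points = [(0, 0), (3, 4), (1, 1), (5, 2)]
best, d = nearest_neighbor((2, 2), points, manhattan)
print(best, d)
```

Because the function takes the metric as a parameter, the same call works for strings under edit distance or any other metric space, which is the generality the abstract refers to.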
Selective sampling is a form of active learning that can reduce the cost of training by drawing only informative data points into the training set. This selected training set is ...
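As a rough illustration of the idea in the abstract above, the sketch below keeps only the points a model is unsure about. It assumes "informative" means "predicted probability close to 0.5", which is one common criterion and not necessarily the one used in the paper. The names `select_informative` and `toy_model`, the `margin` parameter, and the data are all hypothetical.

```python
from typing import Callable, List, Sequence


def select_informative(
    pool: Sequence[float],
    predict_proba: Callable[[float], float],
    margin: float = 0.2,
) -> List[float]:
    """Keep points whose predicted probability lies within `margin` of 0.5.

    Points the model is already confident about are skipped, so they
    never incur a labeling or training cost.
    """
    return [x for x in pool if abs(predict_proba(x) - 0.5) <= margin]


def toy_model(x: float) -> float:
    # Hypothetical scorer: a clipped linear ramp that is uncertain near
    # x = 0 and confident far from it.
    return min(1.0, max(0.0, 0.5 + 0.25 * x))


pool = [-3.0, -1.0, -0.2, 0.1, 0.5, 2.5]
selected = select_informative(pool, toy_model, margin=0.2)
print(selected)
```

In a full active learning loop the selected points would be labeled, added to the training set, and the model retrained before the next round of selection.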
Zhenyu Lu, Anand I. Rughani, Bruce I. Tranmer, Jos...
Search trails mined from browser or toolbar logs comprise queries and the post-query pages that users visit. Implicit endorsements from many trails can be useful for search result...