This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...
The prediction of query performance is an interesting and important issue in Information Retrieval (IR). Current predictors involve the use of relevance scores, which are time-con...
We present a system for searching and classifying U.S. patent documents, based on Inquery. Patents are distributed through hundreds of collections, divided up by general area. The...
This paper presents a new approach to identifying concepts expressed in a collection of email messages, and organizing them into an ontology or taxonomy for browsing. It incorpora...
Recent initiatives like the Million Book Project and Google Print Library Project have already archived several million books in digital format, and within a few years a significa...
Xiaoyue Wang, Lexiang Ye, Eamonn J. Keogh, Christi...