Although information retrieval research has always been concerned with improving the effectiveness of search, in some applications, such as information analysis, a more specific ...
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
In this paper we study the problem of finding most topical named entities among all entities in a document, which we refer to as focused named entity recognition. We show that th...
This paper examines whether the Cranfield evaluation methodology is robust to gross violations of the completeness assumption (i.e., the assumption that all relevant documents wi...