Information retrieval systems are frequently required to handle long queries. Simply using all terms in the query or relying on the underlying retrieval model to appropriately wei...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Definition questions represent a largely unexplored area of question answering--they are different from factoid questions in that the goal is to return as many relevant "nugg...
Abstract. An increasing and overwhelming amount of biomedical information is available in the research literature mainly in the form of free-text. Biologists need tools that automa...
The results of the 2006 ECML/PKDD Discovery Challenge suggest that semi-supervised learning methods work well for spam filtering when the source of available labeled examples diff...