This paper describes the architecture of a Bulgarian–Bulgarian question answering system — BulQA. The system relies on a partially parsed corpus for answer extraction. The que...
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Enterprise search is challenging for several reasons, notably the dynamic terminology and jargon that are specific to the enterprise domain. This challenge is partly addressed by...
We present a fast compression and decompression scheme for natural language texts that allows efficient and flexible string matching by searching the compressed text directly....
Edleno Silva de Moura, Gonzalo Navarro, Nivio Zivi...
Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...