Previews and overviews of large, heterogeneous information resources help users comprehend the scope of collections and focus on particular subsets of interest. For narrative docu...
A large number of question and answer pairs can be collected from question and answer boards and FAQ pages on the Web. This paper proposes an automatic method of finding the ques...
This paper presents a distributed system for storing and retrieving Broadcast News data recorded from Greek television. These multimodal data are processed in a grid compu...
Dimitrios Dimitriadis, A. Metallinou, Ioannis Kons...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
Models of bags of words typically assume topic mixing, so that the words in a single bag come from a limited number of topics. We show here that many sets of bags of words exhibit a...