List question answering (QA) offers a unique challenge in effectively and efficiently locating a complete set of distinct answers from huge corpora or the Web. In TREC-12, the med...
Abstract We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has ...
We are developing a cross-media information retrieval system, in which users can view specific segments of lecture videos by submitting text queries. To produce a text index, the ...
In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
This paper presents an extensive study about the evolution of textual content on the Web, which shows how some new pages are created from scratch while others are created using al...