In this paper, we describe ideas and related experiments of Tsinghua University IR group in TREC 2004 QA track. In this track, our system consists three components: Question analy...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Abstract. Many projects have investigated the issue of storing XML in traditional database systems and exporting data in traditional databases as XML documents. However, they paid ...
Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...