This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been heavily exploited in many monolingual...
The structural features of XML components are an extra source of information that should be used in a content-oriented retrieval task on this type of document. This paper explores...
In this paper we present an initial study on the use of both high- and low-level MPEG-7 descriptions for video retrieval. A brief survey of current XML indexing techniques shows tha...
Document clustering techniques mostly rely on single-term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
We describe research carried out as part of a text summarisation project for the legal domain for which we use a new XML corpus of judgments of the UK House of Lords. These judgmen...