Document ranking is well known to be a crucial process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. ...
Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...
We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, wit...
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, C...
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes docum...
Peer-to-peer systems are gaining popularity as a means to effectively share huge, massively distributed data collections. In this paper, we consider XML peers, that is, peers that ...