For the task of near-duplicate document detection, both traditional fingerprinting techniques used in the database community and bag-of-words comparison approaches used in information...
This paper presents a transaction-time HTTP server, called ? Apache, that supports document versioning. A document often consists of a main file formatted in HTML or XML and severa...
Organizing large document collections for finding information easily and quickly has always been an important user requirement. This paper describes a flexible and powerful dynami...
Cross-language information retrieval (CLIR) facilitates the use of one language to access documents in other languages. Cross-language information extraction (CLIE) extracts releva...
In this poster, we describe an experiment exploring the effectiveness of a pen-based text input device for use in query construction. Standard TREC queries were written, recognise...