We compare different strategies for applying statistical machine translation techniques to retrieve documents that are plausible translations of a given source document. Fi...
We present a novel approach to automatically extracting summary excerpts from audio and video. Our approach is to maximize the average similarity between the excerpt an...
Background: A new algorithm for assessing similarity between primer and template has been developed based on the hypothesis that annealing of primer to template is an information ...
Machine-generated documents containing semi-structured text are rapidly forming the bulk of data being stored in an organisation. Given a feature-based representation of such data,...
The huge amount of data available from Internet information sources has focused much attention on the sharing of distributed information through Peer Data Management Systems (PDMS...