The Rich News system, that can automatically annotate radio and television news with the aid of resources retrieved from the World Wide Web, is described. Automatic speech recogni...
Mike Dowman, Valentin Tablan, Hamish Cunningham, B...
The paper formulates and investigates the aggregation problem for synthesized mediators of Web services (SWMs). An SWM is a finite-state transducer defined in terms of templates...
This work addresses the challenge of extracting structure in educational and training media based on the type of material that is presented during lectures and training sessions. ...
First generation Web-content encodes information in handwritten (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with c...
Jacco van Ossenbruggen, Joost Geurts, Frank Cornel...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...