The Twenty-One project brings together environmental organisations, technology providers and research institutes from several European countries. The main objective of the project...
Wilco G. ter Stal, J.-H. Beijert, G. de Bruin, J. ...
When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...
Text is a pervasive information type, and many applications require querying over text sources in addition to structured data. This paper studies the problem of query processing i...
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identi...
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas...
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...