Sciweavers

244 search results - page 41 / 49
» Improving Web Data Annotations with Spreading Activation
NAACL
2010
13 years 6 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova
EDBT
2009
ACM
14 years 3 months ago
High-performance information extraction with AliBaba
A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...
Peter Palaga, Long Nguyen, Ulf Leser, Jörg Ha...
WWW
2010
ACM
14 years 3 months ago
Large-scale bot detection for search engines
In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
ESWS
2010
Springer
13 years 7 months ago
The Semantic Gap of Formalized Meaning
Recent work in Ontology learning and Text mining has mainly focused on engineering methods to solve practical problems. In this thesis, we investigate methods that can substantially...
Sebastian Hellmann
AAAI
2008
13 years 10 months ago
A Utility-Theoretic Approach to Privacy and Personalization
Online services such as web search, news portals, and ecommerce applications face the challenge of providing high-quality experiences to a large, heterogeneous user base. Recent ef...
Andreas Krause, Eric Horvitz