Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
Traditionally, when one wants to learn about a particular topic, one reads a book or a survey paper. With the rapid expansion of the Web, learning in-depth knowledge about a topic...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table, a data structure that better lends itself to high-level...
This paper proposes relationship discovery models using opinions mined from the Web instead of only conventional collocations. Web opinion mining extracts subjective information f...
Data fusion is the process of integrating multiple sources of information such that their combination yields better results than if the data sources are used individually. This ...