Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phras...
Hung V. Nguyen, P. Velamuru, Deepak Kolippakkam, H...
Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
Most of today's web sources do not provide suitable interfaces for software programs to interact with them. Many researchers have proposed highly effective techniques to addres...
Paula Montoto, Alberto Pan, Juan Raposo, José...
We present ISENS, a distributed, end-to-end, ontology-based information integration system. In response to a user’s query, our system is capable of retrieving facts from data sou...
In recent years, significant attention has been paid to the problem of building wrappers for extracting data from semistructured web sources. Nevertheless, since web sources...