Several Web sites deliver a large number of pages, each publishing data about one instance of some real world entity, such as an athlete, a stock quote, a book. Even though it is ...
Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, ...
Addressed in this paper is the issue of semantic relationship extraction from semi-structured documents. Many research efforts have been made so far on the semantic information ex...
Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
We present a graph-theoretic approach to discover storylines from search results. Storylines are windows that offer glimpses into interesting themes latent among the top search re...