The quality of a statistical machine translation (SMT) system depends heavily on the number of parallel sentences used in training. In recent years, there have been several...
Implicitly structured content on the Web such as HTML tables and lists can be extremely valuable for web search, question answering, and information retrieval, as the implicit str...
The continuous growth of information sources on the web, together with the corresponding volume of daily updated content, makes the problem of finding news and articles a chal...
Andrea Addis, Giuliano Armano, Francesco Mascia, E...
Correct web site text content should help visitors find what they are looking for. However, the reality is quite different: many times the web page text content is a...
Focused Web browsing activities such as periodically looking up headline news, weather reports, etc., which require only selective fragments of particular Web pages, can be made m...