Institutions and companies that are based in countries where the main language is not English typically publish Web sites that offer the same information at least in the local lan...
Filippo Ricca, Paolo Tonella, Emanuele Pianta, Chr...
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including cross-lingual text retrieval, multilingual computational linguistics and mul...
Current Web site development practice does not explicitly address the problems related to multilingual sites. The same information, as well as the same navigation paths, page f...
Paolo Tonella, Filippo Ricca, Emanuele Pianta, Chr...
EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...
Multilingual text compression exploits the existence of the same text in several languages to compress the second and subsequent copies by reference to the first. We explore the d...