Abstract. Activity inference based on object use has received considerable recent attention. Such inference requires statistical models that map activities to the objects used in p...
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
Although PageRank has been designed to estimate the popularity of Web pages, it is a general algorithm that can be applied to the analysis of other graphs other than one of hypert...
Static index pruning techniques aim at removing from the posting lists of an inverted file the references to documents which are likely to be not relevant for answering user querie...
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...