This paper describes a simple clustering approach to person name disambiguation of retrieved documents. The methods are based on standard IR concepts and do not require any task-s...
Many Web information services utilize techniques of information extraction (IE) to collect important facts from the Web. To create more advanced services, one possible method is t...
This paper proposes a strategy to reduce the amount of hardware involved in the solution of search engine queries. It proposes using a secondary compact cache that keeps minimal i...
— The continuous growth of information sources on the web, together with the corresponding volume of dailyupdated contents, makes the problem of finding news and articles a chal...
Andrea Addis, Giuliano Armano, Francesco Mascia, E...
This work aims to provide a novel, site-specific web page segmentation and section importance detection algorithm, which leverages structural, content, and visual information. The...