The World Wide Web holds a wealth of information related to almost any text classification task. This paper presents a method for mining the Web to improve text classificati...
Cross-language information retrieval (CLIR) resources, such as dictionaries and parallel corpora, are scarce for specialized domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...
One of the greatest recent challenges for online advertising is applying adaptive personalization while the Internet continues to grow as a global market...
In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mi...
P-Jigsaw is an extension of W3C's Jigsaw Web server that implements a cache management strategy for replacement and pre-fetching based on association rule mining from the acces...