The World Wide Web has a wealth of information related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
Background: The most common substitution matrices currently used (BLOSUM and PAM) are based on protein sequences with average amino acid distributions; thus, they do not represent ...
With the increasing importance of search in guiding today's web traffic, more and more effort has been spent on creating search engine spam. Since link analysis is one of the m...
We present new combinatorial approximation algorithms for k-set cover. Previous approaches are based on extending the greedy algorithm by efficiently handling small sets. The new a...
Stavros Athanassopoulos, Ioannis Caragiannis, Chri...
This paper investigates the new problem of automatic sense induction for instance names using automatically extracted attribute sets. Several clustering strategies and data source...
Ricardo Martin-Brualla, Enrique Alfonseca, Marius ...