In a higher-level task such as clustering of web results or word sense disambiguation, knowledge of all possible distinct concepts in which an ambiguous word can be expressed woul...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
An approach to detection of phishing webpages based on visual similarity is proposed, which can be utilized as a part of an enterprise solution for anti-phishing. A legitimate web...
Liu Wenyin, Guanglin Huang, Liu Xiaoyue, Zhang Min...
Theoretical analysis of the Web graph is often used to improve the efficiency of search engines. The PageRank algorithm, proposed by [5], is used by the Google search engine [4] t...
This paper addresses Named Entity Mining (NEM), in which we mine knowledge about named entities such as movies, games, and books from a huge amount of data. NEM is potentially use...