Abstract. Online advertising has been suffering from a serious click fraud problem. Fraudulent publishers can generate false clicks using malicious scripts embedded in their web pages. Ev...
Yanlin Peng, Linfeng Zhang, J. Morris Chang, Yong ...
When automatically extracting information from the world wide web, most established methods focus on spotting single HTML documents. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
In this paper, we introduce the concept of α-orthogonal patterns to mine a representative set of graph patterns. Intuitively, two graph patterns are α-orthogonal if their similarit...
Vineet Chaoji, Mohammad Al Hasan, Saeed Salem, J&e...
Document classification presents difficult challenges due to the sparsity and high dimensionality of text data, and to the complex semantics of natural language. The tradi...
The problem of record linkage focuses on determining whether two object descriptions refer to the same underlying entity. Addressing this problem effectively has many practical ap...