Sciweavers

140 search results - page 11 / 28
KDD
2004
ACM
132 views, Data Mining
14 years 10 months ago
A probabilistic framework for semi-supervised clustering
Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to same or different clu...
Sugato Basu, Mikhail Bilenko, Raymond J. Mooney
KDD
2004
ACM
195 views, Data Mining
14 years 10 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
KDD
2004
ACM
330 views, Data Mining
14 years 10 months ago
Learning to detect malicious executables in the wild
In this paper, we describe the development of a fielded application for detecting malicious executables in the wild. We gathered 1971 benign and 1651 malicious executables and enc...
Jeremy Z. Kolter, Marcus A. Maloof
KDD
2004
ACM
145 views, Data Mining
14 years 3 months ago
A graph-theoretic approach to extract storylines from search results
We present a graph-theoretic approach to discover storylines from search results. Storylines are windows that offer glimpses into interesting themes latent among the top search re...
Ravi Kumar, Uma Mahadevan, D. Sivakumar
KDD
2004
ACM
125 views, Data Mining
14 years 10 months ago
Differential Association Rule Mining for the Study of Protein-Protein Interaction Networks
Protein-protein interactions are of great interest to biologists. A variety of high-throughput techniques have been devised, each of which leads to a separate definition of an int...
Christopher Besemann, Anne Denton, Ajay Yekkirala,...