Data Mining | Sciweavers

KDD 2004, ACM

Learning spatially variant dissimilarity (SVaD) measures

16 years 2 months ago

Clustering algorithms typically operate on a feature vector representation of the data and find clusters that are compact with respect to an assumed (dis)similarity measure betwee...

Krishna Kummamuru, Raghu Krishnapuram, Rakesh Agra...

KDD 2004, ACM

Learning to detect malicious executables in the wild

16 years 2 months ago

Download www.cs.georgetown.edu

In this paper, we describe the development of a fielded application for detecting malicious executables in the wild. We gathered 1971 benign and 1651 malicious executables and enc...

Jeremy Z. Kolter, Marcus A. Maloof

KDD 2004, ACM

Improved robustness of signature-based near-replica detection via lexicon randomization

16 years 2 months ago

Download ir.iit.edu

Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...

Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...

KDD 2004, ACM

Towards parameter-free data mining

16 years 2 months ago

Download www.cs.ucr.edu

Most data mining algorithms require the setting of many input parameters. Two main dangers of working with parameter-laden algorithms are the following. First, incorrect settings ...

Eamonn J. Keogh, Stefano Lonardi, Chotirat (Ann) R...

KDD 2004, ACM

When do data mining results violate privacy?

16 years 2 months ago

Download www.utdallas.edu

Privacy-preserving data mining has concentrated on obtaining valid results when the input data is private. An extreme example is Secure Multiparty Computation-based methods, where...

Murat Kantarcioglu, Jiashun Jin, Chris Clifton

KDD 2004, ACM

Web usage mining based on probabilistic latent semantic analysis

16 years 2 months ago

Download maya.cs.depaul.edu

The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discoveri...

Xin Jin, Yanzan Zhou, Bamshad Mobasher

KDD 2004, ACM

Mining coherent gene clusters from gene-sample-time microarray data

16 years 2 months ago

Download www.cse.buffalo.edu

Extensive studies have shown that mining microarray data sets is important in bioinformatics research and biomedical applications. In this paper, we explore a novel type of genesa...

Daxin Jiang, Jian Pei, Murali Ramanathan, Chun Tan...

KDD 2004, ACM

Why collective inference improves relational classification

16 years 2 months ago

Download kdl.cs.umass.edu

Procedures for collective inference make simultaneous statistical judgments about the same variables for a set of related data instances. For example, collective inference could b...

David Jensen, Jennifer Neville, Brian Gallagher

KDD 2004, ACM

Mining the space of graph properties

16 years 2 months ago

Download www-cs-students.stanford.edu

Existing data mining algorithms on graphs look for nodes satisfying specific properties, such as specific notions of structural similarity or specific measures of link-based impor...

Glen Jeh, Jennifer Widom

KDD 2004, ACM

Interestingness of frequent itemsets using Bayesian networks as background knowledge

16 years 2 months ago

Download www.cs.umb.edu

The paper presents a method for pruning frequent itemsets based on background knowledge represented by a Bayesian network. The interestingness of an itemset is defined as the abso...

Szymon Jaroszewicz, Dan A. Simovici
