Sciweavers

SDM
2007
SIAM
Bursty Feature Representation for Clustering Text Streams
Text representation plays a crucial role in classical text mining, where the primary focus was on static text. Nevertheless, well-studied static text representations including TFI...
Qi He, Kuiyu Chang, Ee-Peng Lim, Jun Zhang
HP2PC: Scalable Hierarchically-Distributed Peer-to-Peer Clustering
In distributed data mining models, adopting a flat node distribution model can affect scalability. To address the problem of modularity, flexibility and scalability, we propose...
Khaled M. Hammouda, Mohamed S. Kamel
Boosting Optimal Logical Patterns Using Noisy Data
We consider the supervised learning of a binary classifier from noisy observations. We use smooth boosting to linearly combine abstaining hypotheses, each of which maps a subcube...
Noam Goldberg, Chung-chieh Shan
Segmentations with Rearrangements
Sequence segmentation is a central problem in the analysis of sequential and time-series data. In this paper we introduce and we study a novel variation to the segmentation proble...
Aristides Gionis, Evimaria Terzi
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams but...
Jing Gao, Wei Fan, Jiawei Han, Philip S. Yu
Adaptive Concept Learning through Clustering and Aggregation of Relational Data
We introduce a new approach for Clustering and Aggregating Relational Data (CARD). We assume that data is available in a relational form, where we only have information about the ...
Hichem Frigui, Cheul Hwang
Mining Visual and Textual Data for Constructing a Multi-Modal Thesaurus
We propose an unsupervised approach to learn associations between continuous-valued attributes from different modalities. These associations are used to construct a multi-modal t...
Hichem Frigui, Joshua Caudill
Understanding and Utilizing the Hierarchy of Abnormal BGP Events
Abnormal events, such as security attacks, misconfigurations, or electricity failures, could have severe consequences toward the normal operation of the Border Gateway Protocol (...
Dejing Dou, Jun Li, Han Qin, Shiwoong Kim, Sheng Z...
Towards Attack-Resilient Geometric Data Perturbation
Data perturbation is a popular technique for privacy-preserving data mining. The major challenge of data perturbation is balancing privacy protection and data quality, which are no...
Keke Chen, Gordon Sun, Ling Liu
Preventing Information Leaks in Email
The widespread use of email has raised serious privacy concerns. A critical issue is how to prevent email information leaks, i.e., when a message is accidentally addressed to non-...
Vitor R. Carvalho, William W. Cohen