Sciweavers

» Data Mining and the Web: Past, Present and Future
ICDM
2002
IEEE
14 years 21 days ago
Phrase-based Document Similarity Based on an Index Graph Model
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
Khaled M. Hammouda, Mohamed S. Kamel
KDD
2006
ACM
14 years 8 months ago
Understanding Content Reuse on the Web: Static and Dynamic Analyses
In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Ricardo A. Baeza-Yates, Álvaro R. Pereira J...
AI
2000
Springer
13 years 7 months ago
Towards adaptive Web sites: Conceptual framework and case study
The creation of a complex web site is a thorny problem in user interface design. In this paper we explore the notion of adaptive web sites: sites that semi-automatically improve t...
Mike Perkowitz, Oren Etzioni
KDD
2002
ACM
14 years 8 months ago
Combining clustering and co-training to enhance text classification using unlabelled data
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Bhavani Raskutti, Herman L. Ferrá, Adam Kow...
KDD
2005
ACM
14 years 8 months ago
Combining email models for false positive reduction
Machine learning and data mining can be effectively used to model, classify and discover interesting information for a wide variety of data including email. The Email Mining Toolk...
Shlomo Hershkop, Salvatore J. Stolfo