Sciweavers

KDD
2004
ACM
Adversarial classification
Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
KDD
2004
ACM
Predicting customer shopping lists from point-of-sale purchase data
This paper describes a prototype that predicts the shopping lists for customers in a retail store. The shopping list prediction is one aspect of a larger system we have developed ...
Chad M. Cumby, Andrew E. Fano, Rayid Ghani, Marko ...
KDD
2004
ACM
Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...
William W. Cohen, Sunita Sarawagi
KDD
2004
ACM
Exploring the community structure of newsgroups
We propose to use the community structure of Usenet for organizing and retrieving the information stored i...
Christian Borgs, Jennifer T. Chayes, Mohammad Mahdian, Amin Saberi
KDD
2004
ACM
Systematic data selection to mine concept-drifting data streams
One major problem of existing methods to mine data streams is that they make ad hoc choices to combine most recent data with some amount of old data to search for the new hypothesis. T...
Wei Fan
KDD
2004
ACM
Column-generation boosting methods for mixture of kernels
We devise a boosting approach to classification and regression based on column generation using a mixture of kernels. Traditional kernel methods construct models based on a single...
Jinbo Bi, Tong Zhang, Kristin P. Bennett
KDD
2004
ACM
TiVo: making show recommendations using a distributed collaborative filtering architecture
We describe the TiVo television show collaborative recommendation system which has been fielded in over one million TiVo clients for four years. Over this install base, TiVo curre...
Kamal Ali, Wijnand van Stam
KDD
2004
ACM
Mining reference tables for automatic text segmentation
Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained in legacy sources and text collections into a data ...
Eugene Agichtein, Venkatesh Ganti
KDD
2004
ACM
On demand classification of data streams
Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Phil...