Sciweavers

371 search results - page 26 / 75
ICML
2005
IEEE
14 years 8 months ago
Robust one-class clustering using hybrid global and local search
Unsupervised learning methods often involve summarizing the data using a small number of parameters. In certain domains, only a small subset of the available data is relevant for ...
Gunjan Gupta, Joydeep Ghosh
DATAMINE
2002
13 years 7 months ago
High-Performance Commercial Data Mining: A Multistrategy Machine Learning Application
We present an application of inductive concept learning and interactive visualization techniques to a large-scale commercial data mining project. This paper focuses on design and c...
William H. Hsu, Michael Welge, Thomas Redman, Davi...
WSDM
2012
ACM
12 years 3 months ago
Beyond 100 million entities: large-scale blocking-based resolution for heterogeneous data
A prerequisite for leveraging the vast amount of data available on the Web is Entity Resolution, i.e., the process of identifying and linking data that describe the same real-worl...
George Papadakis, Ekaterini Ioannou, Claudia Niede...
ACL
2010
13 years 5 months ago
Learning Phrase-Based Spelling Error Models from Clickthrough Data
This paper explores the use of clickthrough data for query spelling correction. First, large amounts of query-correction pairs are derived by analyzing users' query reformula...
Xu Sun, Jianfeng Gao, Daniel Micol, Chris Quirk
KDD
2005
ACM
14 years 1 months ago
A distributed learning framework for heterogeneous data sources
We present a probabilistic model-based framework for distributed learning that takes into account privacy restrictions and is applicable to scenarios where the different sites ha...
Srujana Merugu, Joydeep Ghosh