Sciweavers

3450 search results - page 641 / 690
» Media Content Analysis
Sort
View
KDD
2006
ACM
179views Data Mining» more  KDD 2006»
14 years 10 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
KDD
2005
ACM
125views Data Mining» more  KDD 2005»
14 years 10 months ago
Email data cleaning
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
Jie Tang, Hang Li, Yunbo Cao, ZhaoHui Tang
KDD
2004
ACM
131views Data Mining» more  KDD 2004»
14 years 10 months ago
Fast nonlinear regression via eigenimages applied to galactic morphology
Astronomy increasingly faces the issue of massive datasets. For instance, the Sloan Digital Sky Survey (SDSS) has so far generated tens of millions of images of distant galaxies, ...
Brigham Anderson, Andrew W. Moore, Andrew Connolly...
KDD
2004
ACM
163views Data Mining» more  KDD 2004»
14 years 10 months ago
Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...
William W. Cohen, Sunita Sarawagi
KDD
2001
ACM
203views Data Mining» more  KDD 2001»
14 years 10 months ago
Ensemble-index: a new approach to indexing large databases
The problem of similarity search (query-by-content) has attracted much research interest. It is a difficult problem because of the inherently high dimensionality of the data. The ...
Eamonn J. Keogh, Selina Chu, Michael J. Pazzani