Sciweavers

1582 search results - page 302 / 317
» Digital Documents and Media
KDD
2006
ACM
179 views, Data Mining
14 years 8 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
KDD
2004
ACM
154 views, Data Mining
14 years 8 months ago
Diagnosing extrapolation: tree-based density estimation
There has historically been very little concern with extrapolation in Machine Learning, yet extrapolation can be critical to diagnose. Predictor functions are almost always learne...
Giles Hooker
KDD
2004
ACM
210 views, Data Mining
14 years 8 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
KDD
2003
ACM
99 views, Data Mining
14 years 8 months ago
Fragments of order
High-dimensional collections of 0-1 data occur in many applications. The attributes in such data sets are typically considered to be unordered. However, in many cases there is a n...
Aristides Gionis, Teija Kujala, Heikki Mannila
SIGMOD
2009
ACM
269 views, Database
14 years 7 months ago
Efficient approximate entity extraction with edit distance constraints
Named entity recognition aims at extracting named entities from unstructured text. A recent trend of named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...