Sciweavers

» General Database Statistics Using Entropy Maximization
KDD
2006
ACM
14 years 8 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
SAC
2009
ACM
14 years 2 months ago
Applying latent dirichlet allocation to group discovery in large graphs
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (suc...
Keith Henderson, Tina Eliassi-Rad
SIGMOD
2001
ACM
14 years 8 months ago
Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data
Approximating the joint data distribution of a multi-dimensional data set through a compact and accurate histogram synopsis is a fundamental problem arising in numerous practical ...
Amol Deshpande, Minos N. Garofalakis, Rajeev Rasto...
KDD
2000
ACM
13 years 11 months ago
Incremental quantile estimation for massive tracking
Data--call records, internet packet headers, or other transaction records--are coming down a pipe at a ferocious rate, and we need to monitor statistics of the data. There is no r...
Fei Chen, Diane Lambert, José C. Pinheiro
SIGSOFT
2010
ACM
13 years 5 months ago
The missing links: bugs and bug-fix commits
Empirical studies of software defects rely on links between bug databases and program code repositories. This linkage is typically based on bug-fixes identified in developer-enter...
Adrian Bachmann, Christian Bird, Foyzur Rahman, Pr...