Sciweavers

KDD
2008
ACM
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
KDD
2008
ACM
Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model
We developed a model based on nonparametric Bayesian modeling for automatic discovery of semantic relationships between words taken from a corpus. It is aimed at discovering seman...
Issei Sato, Minoru Yoshida, Hiroshi Nakagawa
KDD
2008
ACM
Identifying biologically relevant genes via multiple heterogeneous data sources
Selection of genes that are differentially expressed and critical to a particular biological process has been a major challenge in post-array analysis. Recent development in bioin...
Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Y...
KDD
2008
ACM
Can complex network metrics predict the behavior of NBA teams?
The United States National Basketball Association (NBA) is one of the most popular sports leagues in the world and is well known for driving a multimillion-dollar betting market that uses th...
Antonio Alfredo Ferreira Loureiro, Pedro O. S. Vaz...
KDD
2008
ACM
Constraint programming for itemset mining
The relationship between constraint-based mining and constraint programming is explored by showing how the typical constraints used in pattern mining can be formulated for use in ...
Luc De Raedt, Tias Guns, Siegfried Nijssen
KDD
2008
ACM
Angle-based outlier detection in high-dimensional data
Detecting outliers in a large set of data objects is a major data mining task aiming at finding different mechanisms responsible for different groups of objects in a data set. All...
Hans-Peter Kriegel, Matthias Schubert, Arthur Zime...
KDD
2008
ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2008
ACM
Semi-supervised approach to rapid and reliable labeling of large data sets
Supervised classification methods have been shown to be very effective for a large number of applications. They require a training data set whose instances are labeled to indicate...
György J. Simon, Vipin Kumar, Zhi-Li Zhang
KDD
2008
ACM
A sequential dual method for large scale multi-class linear SVMs
Efficient training of direct multi-class formulations of linear Support Vector Machines is very useful in applications such as text classification with a huge number of examples as w...
S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang...
KDD
2008
ACM
Partial least squares regression for graph mining
Attributed graphs are increasingly common in many application domains such as chemistry, biology and text processing. A central issue in graph mining is how to collect inform...
Hiroto Saigo, Koji Tsuda, Nicole Krämer