Sciweavers

1950 search results - page 24 / 390
» Informative sampling for large unbalanced data sets
IV
2007
IEEE
14 years 1 month ago
Targeted Projection Pursuit for Interactive Exploration of High-Dimensional Data Sets
High-dimensional data is, by its nature, difficult to visualise. Many current techniques involve reducing the dimensionality of the data, which results in a loss of information. ...
Joe Faith
SIGMOD
2001
ACM
14 years 7 months ago
Data Bubbles: Quality Preserving Performance Boosting for Hierarchical Clustering
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Krö...
BMCBI
2007
13 years 7 months ago
Improved residue contact prediction using support vector machines and a large feature set
Background: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and understanding protein folding. In s...
Jianlin Cheng, Pierre Baldi
GECCO
2008
Springer
13 years 8 months ago
Analysis of mammography reports using maximum variation sampling
A genetic algorithm (GA) was developed to implement a maximum variation sampling technique to derive a subset of data from a large dataset of unstructured mammography reports. It ...
Robert M. Patton, Barbara G. Beckerman, Thomas E. ...
PAKDD
2010
ACM
14 years 13 days ago
Online Sampling of High Centrality Individuals in Social Networks
In this work, we investigate the use of online or “crawling” algorithms to sample large social networks in order to determine the most influential or important individuals wit...
Arun S. Maiya, Tanya Y. Berger-Wolf