Sciweavers

83 search results - page 5 / 17
» Building Useful Models from Imbalanced Data with Sampling an...
ICML
2004
IEEE
14 years 8 months ago
Boosting margin based distance functions for clustering
The performance of graph based clustering methods critically depends on the quality of the distance function, used to compute similarities between pairs of neighboring nodes. In t...
Tomer Hertz, Aharon Bar-Hillel, Daphna Weinshall
GECCO
2005
Springer
14 years 1 months ago
Extraction of informative genes from microarray data
Identification of those genes that might anticipate the clinical behavior of different types of cancers is challenging due to availability of a smaller number of patient samples...
Topon Kumar Paul, Hitoshi Iba
SIGMOD
2008
ACM
14 years 7 months ago
Sampling time-based sliding windows in bounded space
Random sampling is an appealing approach to build synopses of large data streams because random samples can be used for a broad spectrum of analytical tasks. Users are often inter...
Rainer Gemulla, Wolfgang Lehner
SIGSOFT
2009
ACM
14 years 8 months ago
Fair and balanced?: bias in bug-fix datasets
Software engineering researchers have long been interested in where and why bugs occur in code, and in predicting where they might turn up next. Historical bug-occurrence data has ...
Christian Bird, Adrian Bachmann, Eirik Aune, John ...
ICVGIP
2004
13 years 8 months ago
Modeling Signs Using Functional Data Analysis
We present a functional data analysis (FDA) based method to statistically model continuous signs of the American Sign Language (ASL) for use in the recognition of signs in contin...
Sunita Nayak, Sudeep Sarkar, Kuntal Sengupta