Sciweavers

» Informative sampling for large unbalanced data sets
SIGIR
2009
ACM
14 years 2 months ago
SUSHI: scoring scaled samples for server selection
Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe...
Paul Thomas, Milad Shokouhi
BMCBI
2010
13 years 7 months ago
A highly efficient multi-core algorithm for clustering extremely large datasets
Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput t...
Johann M. Kraus, Hans A. Kestler
WWW
2008
ACM
14 years 8 months ago
Statistical properties of community structure in large social and information networks
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its member...
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, Mi...
EUSFLAT
2009
13 years 5 months ago
A Fuzzy Set Approach to Ecological Knowledge Discovery
Besides the problem of searching for effective methods for extracting knowledge from large databases (KDD) there are some additional problems with handling ecological data, namely ...
Arkadiusz Salski
LREC
2010
13 years 9 months ago
There's no Data like More Data? Revisiting the Impact of Data Size on a Classification Task
In the paper we investigate the impact of data size on a Word Sense Disambiguation task (WSD). We question the assumption that the knowledge acquisition bottleneck, which is known...
Ines Rehbein, Josef Ruppenhofer