large datasets | Sciweavers

SBP
2011
Springer

413 views | Applied Computing

Identifying Health-Related Topics on Twitter - An Exploration of Tobacco-Related Tweets as a Test Topic

15 years 1 month ago

Public health-related topics are difficult to identify in large conversational datasets like Twitter. This study examines how to model and discover public health topics and themes ...

Kyle W. Prier, Matthew S. Smith, Christophe G. Gir...

PVLDB
2011

232 views | Computer Networks

Social Content Matching in MapReduce

15 years 1 month ago

Download www.vldb.org

Matching problems are ubiquitous. They occur in economic markets, labor markets, internet advertising, and elsewhere. In this paper we focus on an application of matching for soci...

Gianmarco De Francisci Morales, Aristides Gionis, ...

ICASSP
2011
IEEE

446 views | Signal Processing

Searching in one billion vectors: re-rank with source coding

15 years 3 months ago

Download hal.archives-ouvertes.fr

Recent indexing techniques inspired by source coding have been shown successful to index billions of high-dimensional vectors in memory. In this paper, we propose an approach that ...

Hervé Jégou and Romain Tavenard and Matthijs Dou...

posted by hjegou

IIR
2010

110 views | Information Technology

Selecting Features for Ordinal Text Classification

15 years 4 months ago

Download sunsite.informatik.rwth-aachen.de

We present four new feature selection methods for ordinal regression and test them against four different baselines on two large datasets of product reviews.

Stefano Baccianella, Andrea Esuli, Fabrizio Sebast...

PC
2007

173 views | Management

Parallel graphics and visualization

15 years 6 months ago

Download people.freedesktop.org

Parallel volume rendering is one of the most efficient techniques to achieve real time visualization of large datasets by distributing the data and the rendering process over a c...

Luís Paulo Santos, Bruno Raffin, Alan Heiri...

BMCBI
2008

133 views

A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data

15 years 6 months ago

Download www.biomedcentral.com

Background: Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible f...

Luca Corradi, Marco Fato, Ivan Porro, Silvia Scagl...

BMCBI
2010

134 views

R-Gada: a fast and flexible pipeline for copy number analysis in association studies

15 years 6 months ago

Download www.biomedcentral.com

Background: Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genom...

Roger Pique-Regi, Alejandro Cáceres, Juan R...

BMCBI
2010

151 views

Data reduction for spectral clustering to analyze high throughput flow cytometry data

15 years 6 months ago

Download www.biomedcentral.com

Background: Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular ...

Habil Zare, Parisa Shooshtari, Arvind Gupta, Ryan ...

NIPS
2004

146 views | Information Technology

Efficient Kernel Machines Using the Improved Fast Gauss Transform

15 years 8 months ago

Download books.nips.cc

The computation and memory required for kernel machines with N training samples is at least O(N^2). Such a complexity is significant even for moderate size problems and is prohibi...

Changjiang Yang, Ramani Duraiswami, Larry S. Davis

FLAIRS
2001

114 views | Artificial Intelligence

Hierarchical Representatives Clustering with Hybrid Approach

15 years 8 months ago

Download www.aaai.org

Clustering is a discovering process of meaningful information by grouping similar data into compact clusters. Most of traditional clustering methods are in favor of small datasets and hav...

Byung-Joo An, Eunju Kim, Yillbyung Lee
