Public health-related topics are difficult to identify in large conversational datasets like Twitter. This study examines how to model and discover public health topics and themes ...
Kyle W. Prier, Matthew S. Smith, Christophe G. Gir...
Matching problems are ubiquitous. They occur in economic markets, labor markets, internet advertising, and elsewhere. In this paper we focus on an application of matching for soci...
Gianmarco De Francisci Morales, Aristides Gionis, ...
Recent indexing techniques inspired by source coding have been shown successful to index billions of high-dimensional vectors in memory. In this paper, we propose an approach that ...
Hervé Jégou and Romain Tavenard and Matthijs Dou...
We present four new feature selection methods for ordinal regression and test them against four different baselines on two large datasets of product reviews.
Stefano Baccianella, Andrea Esuli, Fabrizio Sebast...
Parallel volume rendering is one of the most efficient techniques to achieve real time visualization of large datasets by distributing the data and the rendering process over a c...
Background: Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible f...
Luca Corradi, Marco Fato, Ivan Porro, Silvia Scagl...
Background: Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genom...
Background: Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular ...
Habil Zare, Parisa Shooshtari, Arvind Gupta, Ryan ...
The computation and memory required for kernel machines with N training samples is at least O(N^2). Such a complexity is significant even for moderate size problems and is prohibi...
Changjiang Yang, Ramani Duraiswami, Larry S. Davis
Clustering is a discovering process of meaningful information by grouping similar data into compact clusters. Most of traditional clustering methods are in favor of small datasets and hav...