large datasets | Sciweavers

HCI
2007

134 views, Human Computer Interaction

FPF-SB : A Scalable Algorithm for Microarray Gene Expression Data Clustering

15 years 8 months ago

Efficient and effective analysis of large datasets from microarray gene expression data is one of the keys to time-critical personalized medicine. The issue we address here is the ...

Filippo Geraci, Mauro Leoncini, Manuela Montangero...

EMNLP
2008

234 views, Natural Language Processing

Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce

15 years 8 months ago

Download www.umiacs.umd.edu

This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...

Jimmy J. Lin

ADMA
2005
Springer

124 views, Data Mining

Finding All Frequent Patterns Starting from the Closure

15 years 8 months ago

Download webdocs.cs.ualberta.ca

Efficient discovery of frequent patterns from large databases is an active research area in data mining with broad applications in industry and deep implications in many areas of d...

Mohammad El-Hajj, Osmar R. Zaïane

DBVIS
1995

162 views, Database

LadMan: A Large Data Management System

15 years 10 months ago

Download www.vissoft.de

More and more of our customers have to deal with very large datasets like elevation data and digital roadmaps covering Europe or even the entire world, very large images e.g. from...

Walter Schmeing

EDBT
2000
ACM

125 views, Computer Science

Mining Classification Rules from Datasets with Large Number of Many-Valued Attributes

15 years 10 months ago

Download www.anderson.ucla.edu

Decision tree induction algorithms scale well to large datasets for their univariate and divide-and-conquer approach. However, they may fail in discovering effective knowledge when...

Giovanni Giuffrida, Wesley W. Chu, Dominique M. Ha...

CLADE
2004
IEEE

110 views, Distributed And Parallel Com...

Grid Service for Visualization and Analysis of Remote Fusion Data

15 years 10 months ago

Download www.txcorp.com

Simulations and experiments in the fusion and plasma physics community generate large datasets at remote sites. Visualization and analysis of these datasets are difficult because ...

Svetlana G. Shasharina, Nanbor Wang, John R. Cary

SIGMOD
1996
ACM

151 views, Database

BIRCH: An Efficient Data Clustering Method for Very Large Databases

15 years 10 months ago

Download www.cs.sfu.ca

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters...

Tian Zhang, Raghu Ramakrishnan, Miron Livny

KDD
1998
ACM

120 views, Data Mining

Large Datasets Lead to Overly Complex Models: An Explanation and a Solution

15 years 11 months ago

Download www.cs.arizona.edu

This paper explores unexpected results that lie at the intersection of two common themes in the KDD community: large datasets and the goal of building compact models. Experiments ...

Tim Oates, David Jensen

ACSC
2002
IEEE

110 views, Theoretical Computer Science

Using Finite State Automata for Sequence Mining

15 years 11 months ago

Download crpit.com

We show how frequently occurring sequential patterns may be found from large datasets by first inducing a finite state automaton model describing the data, and then querying the m...

Philip Hingston

CLOUD
2010
ACM

178 views, Distributed And Parallel Com...

Towards automatic optimization of MapReduce programs

15 years 11 months ago

Download www.cs.duke.edu

Timely and cost-effective processing of large datasets has become a critical ingredient for the success of many academic, government, and industrial organizations. The combination...

Shivnath Babu
