Sciweavers

304 search results - page 21 / 61
» Parallel and Distributed Compressed Indexes
IPPS
2010
IEEE
13 years 6 months ago
Improving MapReduce performance through data placement in heterogeneous Hadoop clusters
MapReduce has become an important distributed processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop
Jiong Xie, Shu Yin, Xiaojun Ruan, Zhiyang Ding, Yu...
SIGMOD
2007
ACM
14 years 8 months ago
How to barter bits for chronons: compression and bandwidth trade offs for database scans
Two trends are converging to make the CPU cost of a table scan a more important component of database performance. First, table scans are becoming a larger fraction of the query p...
Allison L. Holloway, Vijayshankar Raman, Garret Sw...
IPPS
1993
IEEE
14 years 18 days ago
Supporting Insertions and Deletions in Striped Parallel Filesystems
The dramatic improvements in the processing rates of parallel computers are turning many compute-bound jobs into IO-bound jobs. Parallel file systems have been proposed to better ma...
Theodore Johnson
WWW
2005
ACM
14 years 9 months ago
Improving Web search efficiency via a locality based static pruning method
The unarguably fast, and continuous, growth of the volume of indexed (and indexable) documents on the Web poses a great challenge for search engines. This is true regarding not on...
Edleno Silva de Moura, Célia Francisca dos ...
IPPS
2008
IEEE
14 years 2 months ago
Multi-threaded data mining of EDGAR CIKs (Central Index Keys) from ticker symbols
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
Douglas A. Lyon