Sciweavers

363 search results - page 56 / 73
» Analyzing Large Collections of Email
HPDC
2005
IEEE
14 years 1 month ago
Genetic algorithm based automatic data partitioning scheme for HPF
A good data partitioning scheme is the need of the time. However, it is very difficult to arrive at a good solution as the number of possible data partitions for a given real life progra...
Sunil Kumar Anand, Y. N. Srikant
AND
2009
13 years 5 months ago
Digital weight watching: reconstruction of scanned documents
A web-portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx
SIGMOD
2010
ACM
13 years 2 months ago
Data warehousing and analytics infrastructure at Facebook
Scalable analysis on large data sets has been core to the functions of a number of teams at Facebook - both engineering and nonengineering. Apart from ad hoc analysis of data and ...
Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba ...
ICDE
2009
IEEE
14 years 9 months ago
Space-Constrained Gram-Based Indexing for Efficient Approximate String Search
Answering approximate queries on string collections is important in applications such as data cleaning, query relaxation, and spell checking, where inconsistencies and e...
Alexander Behm, Shengyue Ji, Chen Li, Jiaheng Lu
ICDE
2010
IEEE
14 years 2 months ago
Hive - a petabyte scale data warehouse using Hadoop
The size of data sets being collected and analyzed in the industry for business intelligence is growing rapidly, making traditional warehousing solutions prohibitively expensiv...
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zhen...