Sciweavers

178 search results - page 31 / 36
» Scheduling Algorithms for Web Crawling
CCGRID
2001
IEEE
13 years 11 months ago
XtremWeb: A Generic Global Computing System
Global Computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model target...
Gilles Fedak, Cécile Germain, Vincent Né...
AAAI
2010
13 years 9 months ago
Prioritization of Domain-Specific Web Information Extraction
It is often desirable to extract structured information from raw web pages for better information browsing, query answering, and pattern mining. Many such Information Extraction (...
Jian Huang, Cong Yu
OSDI
2008
ACM
14 years 7 months ago
Improving MapReduce Performance in Heterogeneous Environments
MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-...
Matei Zaharia, Andy Konwinski, Anthony D. Joseph, ...
SOSP
2003
ACM
14 years 4 months ago
Capriccio: scalable threads for internet services
This paper presents Capriccio, a scalable thread package for use with high-concurrency servers. While recent work has advocated event-based systems, we believe that thread-based sy...
J. Robert von Behren, Jeremy Condit, Feng Zhou, Ge...
SIGCOMM
2010
ACM
13 years 7 months ago
NapSAC: design and implementation of a power-proportional web cluster
Energy consumption is a major and costly problem in data centers. A large fraction of this energy goes to powering idle machines that are not doing any useful work. We identify tw...
Andrew Krioukov, Prashanth Mohan, Sara Alspaugh, L...