Sciweavers

125 search results - page 5 / 25
» Minimizing the Network Distance in Distributed Web Crawling
LREC
2010
13 years 10 months ago
Building a Web Corpus of Czech
Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
Drahomíra "johanka" Spoustová, Miros...
CLOUD
2010
ACM
14 years 1 month ago
Stateful bulk processing for incremental analytics
This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
CATE
2004
13 years 10 months ago
A Web Portal for Open-Source Synchronous Distance Education
Network EducationWare (NEW) is an integrated collection of open-source software for synchronous Internet communication, where a class is simultaneously taught to local students an...
J. Mark Pullen, Priscilla M. McAndrews
HPDC
2003
IEEE
14 years 1 month ago
Distributed Pagerank for P2P Systems
This paper defines and describes a fully distributed implementation of Google’s highly effective Pagerank algorithm, for “peer to peer” (P2P) systems. The implementation is ...
Karthikeyan Sankaralingam, Simha Sethumadhavan, Ja...
SAC
2005
ACM
14 years 2 months ago
A distributed content-based search engine based on mobile code
Current search engines crawl the Web, download content, and digest this content locally. For multimedia content, this involves considerable volumes of data. Furthermore, this proc...
Volker Roth, Ulrich Pinsdorf, Jan Peters