Sciweavers

339 search results - page 49 / 68
» Collection Maintenance in the Digital Library
CLEF
2005
Springer
14 years 1 month ago
EuroGOV: Engineering a Multilingual Web Corpus
EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...
Börkur Sigurbjörnsson, Jaap Kamps, Maart...
ERCIMDL
2005
Springer
14 years 1 month ago
A Hybrid Declarative/Procedural Metadata Mapping Language Based on Python
The Alexandria Digital Library (ADL) project has been working on automating the processes of building ADL collections and gathering the collection statistics on which ADL’s disco...
Greg Janee, James Frew
SIGIR
2004
ACM
14 years 1 month ago
Multiple sources of evidence for XML retrieval
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...
Börkur Sigurbjörnsson, Jaap Kamps, Maart...
JCDL
2003
ACM
14 years 28 days ago
The OAI-PMH Static Repository and Static Repository Gateway
Although the OAI-PMH specification is focused on making it straightforward for data providers to expose metadata, practice shows that in certain significant situations deployment ...
Patrick Hochstenbach, Henry N. Jerez, Herbert Van ...
SIGIR
2010
ACM
13 years 2 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang