

Building a research library for the history of the web

This paper describes the building of a research library for studying the Web, especially research on how the structure and content of the Web change over time. The library is particularly aimed at supporting social scientists for whom the Web is both a fascinating social phenomenon and a mirror on society. The library is built on the collections of the Internet Archive, which has been preserving a crawl of the Web every two months since 1996. The technical challenges in organizing this data for research fall into two categories: high-performance computing to transfer and manage the very large amounts of data, and human-computer interfaces that empower research by non-computer specialists.

Categories and Subject Descriptors H.3.7 [Information Storage and Retrieval]: Digital Libraries – collection, systems issues, user issues. J.4 [Social and Behavioral Sciences]: sociology.

General Terms Algorithms, Management, Measurement, Performance, Design, Human Factors.

Keywords history of the ...
Added 14 Jun 2010
Updated 14 Jun 2010
Type Conference
Year 2006
Where JCDL
Authors William Y. Arms, Selcuk Aya, Pavel Dmitriev, Blazej J. Kot, Ruth Mitchell, Lucia Walle