Abstract. This paper examines technology developed to support large-scale distributed digital libraries. We describe the method used for harvesting collection information using stan...
Information on the Web is not only abundant but also redundant. This redundancy of information has an important consequence for the relation between the recall of an information ga...
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
In this paper, we discuss a prototype application deployed at the U.S. National Science Foundation for assisting program directors in identifying reviewers for proposals. The appl...
Early modern books written in Latin contain many abbreviations of common words that are derived from earlier manuscript practice. While these abbreviations are usually easily deci...