As the number of non-English resources available on the Web is increasing rapidly, developing information retrieval techniques for non-English languages is becoming an urgent and ...
Abstract. This paper introduces ThreadMill - a distributed and parallel component architecture for applications that process large volumes of streamed (time-sequenced) data, such a...
In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task to decide to which libraries a query should be routed....
Abstract-- The operational characteristics of integrated circuits based on nanoscale semiconductor technology are expected to be increasingly affected by variations in the manufact...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...