As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
Recent times have seen an explosive growth in the availability of various kinds of data. This growth has created an unprecedented opportunity to develop automated data-driven ...
Modern digital libraries require user-friendly and yet responsive access to the rapidly growing, heterogeneous, and distributed collection of information sources. However, the inc...
We built a system for the automatic creation of a text-based topic hierarchy, meant to be used in a geographically defined community. This poses two main problems. First, the appea...
The Internet brings us access to multimedia databases with billions of data instances. The massive amount of data available to researchers and application developers brings both o...