Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of het...
Hypertext interfaces are considered appropriate for information exploration tasks. The prohibitively expensive link creation effort, however, prevents traditional hypertext interf...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Topic distillation aims at finding key resources, which are high-quality pages for certain topics. Based on an analysis of non-content features of key resources, a pre-selection method is ...
With the increasing use of ontologies in the Semantic Web and enterprise knowledge management, it is critical to develop scalable and efficient ontology management systems. In this pap...
Jian Zhou, Li Ma, Qiaoling Liu, Lei Zhang, Yong Yu...