Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
We address the problem of querying XML data over a P2P network. In P2P networks, the allowed kinds of queries are usually exact-match queries over file names. We discuss the exte...
UML class diagrams can be used as a language for expressing a conceptual model of a domain. In a series of papers [1,2,3] we have been using the General Ontological Language (GOL) ...
An important class of queries is the LIKE predicate in SQL. In the absence of an index, LIKE queries are subject to performance degradation. The notion of indexing on substrings (...
This paper presents a cluster validation based document clustering algorithm, which is capable of identifying both important feature words and true model order (cluster number). I...