In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. Experimenta...
Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifica...
Abstract. To automatically retrieve documents or images from a database, retrieval systems use similarity measures to compare a request based on features extracted from the documen...
The ability to distinguish statistically different populations of speakers or writers can be an important asset in many NLP applications. In this paper, we describe a method of us...
Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic simila...
Xiaodan Zhang, Liping Jing, Xiaohua Hu, Michael K....