. When publishing documents on the web, the user needs to describe and classify her documents for the benefit of later retrieval and use. This paper presents an approach to semanti...
One of the most important aspects of a Web document is its up-to-dateness or recency. Up-to-dateness is particularly relevant to Web documents because they usually contain content...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
Today, search engine is the most commonly used tool for Web information retrieval, however, its current status is still far from satisfaction. This paper focuses on clustering Web...
Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic simila...
Xiaodan Zhang, Liping Jing, Xiaohua Hu, Michael K....