We present an information retrieval model for combining evidence from concept-based semantics, term statistics, and context for improving search precision of genomics literature...
This paper describes an unsupervised algorithm for placing unknown words into a taxonomy and evaluates its accuracy on a large and varied sample of words. The algorithm works by ...
Chinese text contains no blanks to mark word boundaries. As a result, identifying words is difficult because of segmentation ambiguities and occurrences of unknown words. Conve...
Most traditional text clustering methods use a "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...
E-commerce and knowledge management applications generate and consume tremendous amounts of online information that is typically available as textual documents. To facilitate subs...