Large-scale distributed data management with P2P systems requires the existence of similarity operators for queries as we cannot assume that all users will agree on exactly the sa...
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
The blogosphere--the totality of blog-related Web sites-has become a great source of trend analysis in areas such as product survey, customer relationship, and marketing. Existing...
Traditional ranking mainly focuses on one type of data source, and effective modeling still relies on a sufficiently large number of labeled or supervised examples. However, in m...
Bo Wang, Jie Tang, Wei Fan, Songcan Chen, Zi Yang,...
The vast user-provided image tags on the popular photo sharing websites may greatly facilitate image retrieval and management. However, these tags are often imprecise and/or incom...