NLPsystem developers and corpus lexicographers would both bene t from a tool for nding and organizing the distinctive patterns of use of words in texts. Such a tool would be an ass...
Similarity analysis of source code is helpful during development to provide, for instance, better support for code reuse. Consider a development environment that analyzes code whi...
Tobias Sager, Abraham Bernstein, Martin Pinzger, C...
Stream-processing systems are designed to support an emerging class of applications that require sophisticated and timely processing of high-volume data streams, often originating...
Alex Rasin, Jeong-Hyon Hwang, Magdalena Balazinska...
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...