This paper presents a novel way of improving POS tagging on heterogeneous data. First, two separate models are trained (generalized and domain-specific) from the same data set by...
Application level traffic classification is one of the major issues in network monitoring and traffic engineering. In our previous study, we proposed a new traffic classificatio...
Jae Yoon Chung, Byungchul Park, Young J. Won, John...
Multidocument extractive summarization relies on the concept of sentence centrality to identify the most important sentences in a document. Centrality is typically defined in term...
We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspir...
MMR (Maximum Marginal Relevance) is widely used in summarization for its simplicity and efficacy, and has been demonstrated to achieve comparable performance to other approaches ...
We consider topic detection without any prior knowledge of category structure or possible categories. Keywords are extracted and clustered based on different similarity measures u...
This paper describes a research effort to improve the use of the cosine similarity information retrieval technique to detect unknown, known or variances of known rogue software by...
Social tagging describes a community of users labeling web content with tags. It is a simple activity that enriches our knowledge about resources on the web. For a computer to hel...
We study the following problem: how to efficiently find in a collection of strings those similar to a given query string? Various similarity functions can be used, such as edit dis...
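As an illustration of the MMR (Maximum Marginal Relevance) entry listed above, the standard greedy selection rule can be sketched as follows. This is a minimal sketch and not the method of any paper in this list: the whitespace tokenizer, the bag-of-words cosine similarity, the function names `cosine` and `mmr_select`, and the default `lam=0.7` are all assumptions chosen for illustration. At each step the rule picks the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any sentence already selected.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, sentences, k=2, lam=0.7):
    """Greedily pick up to k sentences, trading relevance against redundancy.

    lam=1.0 ranks purely by relevance to the query; lower values
    penalize similarity to sentences that were already selected.
    """
    # Hypothetical preprocessing: lowercase and split on whitespace.
    q = Counter(query.lower().split())
    vecs = [Counter(s.lower().split()) for s in sentences]

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vecs[i], q)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam` close to 1 the selection tends to repeat near-duplicate sentences that match the query well; lowering `lam` pushes the second and later picks toward sentences that differ from what has already been chosen.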