Backward beam search for dependency analysis of Japanese is proposed. As dependencies normally go from left to right in Japanese, it is effective to analyze sentences backwards (f...
We have developed a technique to characterize software developers' styles using a set of source code metrics. This style fingerprint can be used to identify the likely author...
Recent years have seen a growing interest in research into patent retrieval. One of the key issues in conducting information retrieval (IR) research is meaningful evaluation of the...
1 We consider the problem of similarity search in applications where the cost of computing the similarity between two records is very expensive, and the similarity measure is not a...
Chris Jermaine, Fei Xu, Mingxi Wu, Ravi Jampani, T...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...