The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
In this study, we describe our system at the Intellectual Property track of the 2009 CrossLanguage Evaluation Forum campaign (CLEF-IP). The CLEF-IP track addressed prior art searc...
Diagrams are a critical part of virtually all scientific and technical documents. Analyzing diagrams will be important for building comprehensive document retrieval systems. This ...
Robert P. Futrelle, Mingyan Shao, Chris Cieslik, A...
Kernel Canonical Correlation Analysis (KCCA) is a method of correlating linear relationship between two variables in a kernel defined feature space. A machine learning algorithm b...
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...