We propose a new technique for clustering of text documents that relies on a biclustering structure constructed on terms and documents. Our approach makes use of a greedy algorith...
Many document images are rich in color and have complex background. To detect text from them, a standard approach utilizes both color and binary information. This often leads to t...
This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often anno...
S. R. K. Branavan, Harr Chen, Jacob Eisenstein, Re...
Documents written in natural languages constitute a major part of the software engineering lifecycle artifacts. Especially during software maintenance or reverse engineering, seman...
In this work, we present two new approaches based on variable neighborhood search (VNS) and ant colony optimization (ACO) for the reconstruction of cross cut shredded text documen...