To circumvent prevalent text-based anti-spam filters, spammers have begun embedding the advertisement text in images. Analogously, proprietary information (such as source code) ma...
Hrishikesh Aradhye, Gregory K. Myers, James A. Her...
As Adobe's Portable Document Format has exploded in popularity, so too has the number of PDF generators, and predictably the quality of generated PDF varies considerably. This pa...
We describe an algorithm for computing an image signature, suitable for first-stage screening for duplicate images. Our signature relies on relative brightness of image regions, a...
An innovative algorithm for automatic generation of Huffman coding tables for semantic classes of digital images is presented. Collecting statistics over a large dataset of corresp...
It is important for future NLP systems to formulate the semantic equivalence (and more generally, the semantic similarity) of natural language expressions. In particular, paraphra...