In this paper we propose to define a measure of visual similarity to compare different pages in a corpus. This measure is based on the analysis of the visual layout saliency of th...
Text classification has matured as a research discipline over the last decade. Independently, business intelligence over structured databases has long been a source of insights fo...
Text visualization becomes an increasingly more important research topic as the need to understand massive-scale textual information is proven to be imperative for many people and...
Lei Shi, Furu Wei, Shixia Liu, Li Tan, Xiaoxiao Li...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
In this paper, we examine how conclusions about linkability threats can be drawn by analyzing message contents and subject knowledge in arbitrary communication systems. At first, ...