An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities pre...
This paper focuses on automatically improving the readability of documents. We explore mechanisms relating to content control that could be used (i) by authors to improve the qual...
In this article we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’s R(D) theory in the case of Gaussian...
David L. Donoho, Martin Vetterli, Ronald A. DeVore...
Abstract. One major goal of text mining is to provide automatic methods to help humans grasp the key ideas in ever-increasing text corpora. To this effect, we propose a statistica...