Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Smooth boosting algorithms are variants of boosting methods which handle only smooth distributions on the data. They have been proved to be noise-tolerant and can be used in th...
In recent work we have developed a novel approach to the design and implementation of an online portal (ePortal) to help application engineers find replacements for electronic par...
Over the last decades, information retrieval (IR) research has actively developed approaches that allow machine indexing to significantly improve indexing practice in lib...
Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches which take advantage of the millions...