CT The World Wide Web has since its beginning provided linking to and from text documents encoded in HTML. The Web has evolved and most Web browsers now support a rich set of media...
Background: Document classification is a wide-spread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
Maximizing only the relevance between queries and documents will not satisfy users if they want the top search results to present a wide coverage of topics by a few representative...
Yi Liu, Benyu Zhang, Zheng Chen, Michael R. Lyu, W...
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
In this paper we present the architectural blueprint and a prototypical implementation of Oryx, a visual environment that facilitates zero-installation for creating, sharing, and d...