We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through...
Adriano Veloso, Wagner Meira Jr., Marco Cristo, Ma...
The design of efficient textual similarities is an important issue in the domain of textual data exploration. Textual similarities are for example central in document collection s...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Soon, much of the data exchanged over the Internet will be encoded in XML, allowing for sophisticated filtering and content-based routing. We have built a filtering engine called ...
Yanlei Diao, Peter M. Fischer, Michael J. Franklin...