Sciweavers

WEBI
2007
Springer
14 years 1 month ago
An unsupervised hierarchical approach to document categorization
We propose a hierarchical approach to document categorization that requires no pre-configuration and maps the semantic document space to a predefined taxonomy. The utilizatio...
Robert Wetzker, Tansu Alpcan, Christian Bauckhage,...
CIKM
2004
ACM
14 years 28 days ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann
BTW
2003
Springer
14 years 23 days ago
XPath-Aware Chunking of XML-Documents
Dissemination systems are used to route information received from many publishers individually to multiple subscribers. The core of a dissemination system consists of an efficient...
Wolfgang Lehner, Florian Irmert
SIGIR
2012
ACM
11 years 10 months ago
Optimizing positional index structures for versioned document collections
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
Jinru He, Torsten Suel
ECIR
2011
Springer
12 years 11 months ago
Exploiting Thread Structures to Improve Smoothing of Language Models for Forum Post Retrieval
Due to many unique characteristics of forum data, forum post retrieval is different from traditional document retrieval and web search, raising interesting research questions abou...
Huizhong Duan, Chengxiang Zhai