Sciweavers

CIKM
2004
ACM
14 years 24 days ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann
GIR
2006
ACM
14 years 1 months ago
Associating spatial patterns to text-units for summarizing geographic information
Retrieving data based not only on key words is a challenge. We worked on semi-structured data (cultural heritage corpora). Our project aimed at getting the most relevant text-unit...
Julien Lesbegueries, Christian Sallaberry, Mauro G...
ERCIMDL
2007
Springer
14 years 1 months ago
The Semantic GrowBag Algorithm: Automatically Deriving Categorization Systems
Using keyword search to find relevant objects in digital libraries often results in way too large result sets. Based on the metadata associated with such objects, the faceted sear...
Jörg Diederich, Wolf-Tilo Balke
CIKM
2009
ACM
14 years 2 months ago
Completing Wikipedia's hyperlink structure through dimensionality reduction
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...
Robert West, Doina Precup, Joelle Pineau
JCDL
2006
ACM
14 years 1 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma