Sciweavers

» Improved CHAID algorithm for document structure modelling
ICDAR
2011
IEEE
12 years 7 months ago
A Handwritten Character Extraction Algorithm for Multi-language Document Image
In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, ...
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yan...
RIDE
2002
IEEE
14 years 15 days ago
Enhancive Index for Structured Document
Structured documents, especially XML documents, are made up of a few logical components, such as title, sections, subsections and paragraphs. The components in each structured...
Xiaoling Wang, Ji-Rong Wen, Yisheng Dong, Wenyin L...
CIKM
2006
Springer
13 years 11 months ago
Text classification improved through multigram models
Classification algorithms and document representation approaches are two key elements for a successful document classification system. In the past, much work has been conducted to...
Dou Shen, Jian-Tao Sun, Qiang Yang, Zheng Chen
ACL
2009
13 years 5 months ago
Markov Random Topic Fields
Most approaches to topic modeling assume an independence between documents that is frequently violated. We present a topic model that makes use of one or more user-specified grap...
Hal Daume III
CIKM
2008
Springer
13 years 9 months ago
A generative retrieval model for structured documents
Structured documents contain elements defined by the author(s) and annotations assigned by other people or processes. Structured documents pose challenges for probabilistic retrie...
Le Zhao, Jamie Callan