Sciweavers

260 search results - page 6 / 52
» Compression of Compound Documents
Sort
View
CIKM
1999
Springer
13 years 11 months ago
Word Segmentation and Recognition for Web Document Framework
It is observed that a better approach to Web information understanding is to base on its document framework, which is mainly consisted of (i) the title and the URL name of the pag...
Chi-Hung Chi, Chen Ding, Andrew Lim
ICML
2006
IEEE
14 years 8 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
WWW
2006
ACM
14 years 8 months ago
Relaxed: on the way towards true validation of compound documents
To maintain interoperability in the Web environment it is necessary to comply with Web standards. Current specifications of HTML and XHTML languages define conformance conditions ...
Jirka Kosek, Petr Nálevka
CSUR
2000
86views more  CSUR 2000»
13 years 7 months ago
HotDoc: a framework for compound documents
applications parts. Programmers can easily implement new parts by using abstract classes of the framework. HotDoc is implemented in VisualWorks Smalltalk.
Jürgen Buchner