Sciweavers

1477 search results - page 60 / 296
» What's the point of documentation
ICML
2006
IEEE
14 years 8 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
KDD
2005
ACM
14 years 8 months ago
A hybrid unsupervised approach for document clustering
We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to ...
Mihai Surdeanu, Jordi Turmo, Alicia Ageno
ICDAR
2005
IEEE
14 years 1 month ago
Text Recognition of Low-resolution Document Images
Cheap and versatile cameras make it possible to capture a wide variety of documents quickly and easily. However, low-resolution cameras present a challenge to OCR because it is vi...
Charles E. Jacobs, Patrice Y. Simard, Paul A. Viol...
DEXAW
1995
IEEE
13 years 11 months ago
Principles and Tools for Authoring Knowledge-Rich Documents
Digital libraries can take advantage of documents that have their content (semantics) explicitly represented as knowledge structures. These knowledge-rich documents can be created ...
Robert P. Futrelle, Natalya Fridman Noy
ERCIMDL
2010
Springer
13 years 8 months ago
DINAH, A Philological Platform for the Construction of Multi-structured Documents
We consider how the construction of multi-structured documents implies the definition of structuration vocabularies. In a multi-user context, the growth of these vocabula...
Pierre-Edouard Portier, Sylvie Calabretto