Sciweavers

52 search results - page 6 / 11
» Representing OCRed documents in HTML
ICML
2006
IEEE
14 years 8 months ago
Dynamic topic models
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the n...
David M. Blei, John D. Lafferty
ELPUB
2006
ACM
14 years 1 month ago
Electronic Publishing of Digitised Works
This paper describes the automated process to create structured master and access copies for the digitised works at the BND – National Digital Library. The BND created during 20...
João Penas, João Gil, Gilberto Pedro...
SAINT
2005
IEEE
14 years 1 month ago
Learning Logic Wrappers for Information Extraction from the Web
This paper discusses a methodology for applying general-purpose first-order inductive learning to extract information from Web documents structured as unranked ordered trees. The...
Costin Badica, Elvira Popescu, Amelia Badica
ANTSW
2004
Springer
14 years 29 days ago
How to Use Ants for Hierarchical Clustering
We present a new model for hierarchical document clustering, inspired by the self-assembly behavior of real ants. We have simulated the way ants...
Hanene Azzag, Christiane Guinot, Gilles Venturini
DOCENG
2011
ACM
12 years 7 months ago
Interoperable metadata semantics with meta-metadata: a use case integrating search engines
A use case involving integrating results from search engines illustrates how the meta-metadata language facilitates interoperable metadata semantics. Formal semantics can be hard ...
Yin Qu, Andruid Kerne, Andrew M. Webb, Aaron Herst...