Sciweavers

» Exploring Digital Libraries with Document Image Retrieval
CIKM
2008
Springer
13 years 10 months ago
Efficient and effective link analysis with precomputed SALSA maps
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...
Marc Najork, Nick Craswell
SIGIR
2003
ACM
14 years 2 months ago
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
Xiang Ji, Hongyuan Zha
JCDL
2005
ACM
14 years 2 months ago
Downloading textual hidden web content through keyword queries
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...
Alexandros Ntoulas, Petros Zerfos, Junghoo Cho
ELPUB
2007
ACM
14 years 21 days ago
Towards an Ontology of ElPub/SciX: A Proposal
A proposal is presented for a standard ontology language defined as ElPub/SciX Ontology, based on the content of a web digital library of conference proceedings. This content, i.e...
Sely Maria de Souza Costa, Cláudio Gottscha...
KDD
2004
ACM
14 years 9 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...