Sciweavers

» Conceptual Indexing Using Thematic Representation of Texts
ACSC
2009
IEEE
14 years 1 months ago
A ConceptLink Graph for Text Structure Mining
Most text mining methods are based on representing documents using a vector space model, commonly known as a bag-of-words model, where each document is modeled as a linear vector r...
Rowena Chau, Ah Chung Tsoi, Markus Hagenbuchner, V...
CIVR
2009
Springer
14 years 1 months ago
Web news categorization using a cross-media document graph
In this paper we propose a multimedia categorization framework that is able to exploit information across different parts of a multimedia document (e.g., a Web page, a PDF, a Micr...
José Iria, Fabio Ciravegna, João Mag...
ICDAR
1997
IEEE
13 years 11 months ago
Representing OCRed documents in HTML
OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Tao Hong, Sargur N. Srihari
ICML
1998
IEEE
14 years 7 months ago
Learning a Language-Independent Representation for Terms from a Partially Aligned Corpus
Cross-language latent semantic indexing is a method that learns useful language-independent vector representations of terms through a statistical analysis of a document-aligned text...
Michael L. Littman, Fan Jiang, Greg A. Keim
CSL
2007
Springer
13 years 6 months ago
Soft indexing of speech content for search in spoken documents
The paper presents the Position Specific Posterior Lattice (PSPL), a novel lossy representation of automatic speech recognition lattices that naturally lends itself to efficient ...
Ciprian Chelba, Jorge Silva, Alex Acero