Sciweavers

498 search results - page 80 / 100
» Robust web content extraction
SIGIR
2008
ACM
13 years 7 months ago
Comments-oriented document summarization: understanding documents with readers' feedback
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and ...
Meishan Hu, Aixin Sun, Ee-Peng Lim
ICASSP
2008
IEEE
14 years 1 month ago
Automatic lecture transcription by exploiting presentation slide information for language model adaptation
The paper addresses language model adaptation for automatic lecture transcription by fully exploiting presentation slide information used in the lecture. As the text in the presen...
Tatsuya Kawahara, Yusuke Nemoto, Yuya Akita
WWW
2006
ACM
14 years 8 months ago
Improved annotation of the blogosphere via autotagging and hierarchical clustering
Tags have recently become popular as a means of annotating and organizing Web pages and blog entries. Advocates of tagging argue that the use of tags produces a 'folksonomy'...
Christopher H. Brooks, Nancy Montanez
WWW
2005
ACM
14 years 8 months ago
Automatically learning document taxonomies for hierarchical classification
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Kunal Punera, Suju Rajan, Joydeep Ghosh
MKM
2009
Springer
14 years 1 month ago
From Tessellations to Table Interpretation
The extraction of the relations of nested table headers to content cells is automated with a view to constructing narrow domain ontologies of semistructured web data. A taxonomy of...
Ramana C. Jandhyala, Mukkai S. Krishnamoorthy, Geo...