Sciweavers

Search results for "Learning to Extract Content from News Webpages" (240 results, page 21 of 48)
CIKM
2005
Springer
14 years 3 months ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...
CIKM
2009
Springer
14 years 2 months ago
Probabilistic models for topic learning from images and captions in online biomedical literatures
Biomedical images and captions are one of the major sources of information in online biomedical publications. They often contain the most important results to be reported, and pro...
Xin Chen, Caimei Lu, Yuan An, Palakorn Achananupar...
SIGIR
2005
ACM
14 years 3 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
NAACL
2004
13 years 11 months ago
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...
Regina Barzilay, Lillian Lee
EMNLP
2010
13 years 7 months ago
Incorporating Content Structure into Text Analysis Applications
In this paper, we investigate how modeling content structure can benefit text analysis applications such as extractive summarization and sentiment analysis. This follows the lingu...
Christina Sauper, Aria Haghighi, Regina Barzilay