Sciweavers

81 search results - page 9 / 17
» Learning to Separate Text Content and Style for Classificati...
Sort
View
LREC
2008
160views Education» more  LREC 2008»
13 years 9 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
SOCIALCOM
2010
13 years 5 months ago
A Methodology for Integrating Network Theory and Topic Modeling and its Application to Innovation Diffusion
Text data pertaining to socio-technical networks often are analyzed separately from relational data, or are reduced to the fact and strength of the flow of information between node...
Jana Diesner, Kathleen M. Carley
KDD
2008
ACM
195views Data Mining» more  KDD 2008»
14 years 8 months ago
Learning from multi-topic web documents for contextual advertisement
Contextual advertising on web pages has become very popular recently and it poses its own set of unique text mining challenges. Often advertisers wish to either target (or avoid) ...
Yi Zhang, Arun C. Surendran, John C. Platt, Mukund...
WWW
2009
ACM
14 years 8 months ago
Purely URL-based topic classification
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...
Eda Baykan, Monika Rauch Henzinger, Ludmila Marian...
JIIS
2002
168views more  JIIS 2002»
13 years 7 months ago
Hidden Markov Models for Text Categorization in Multi-Page Documents
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...
Paolo Frasconi, Giovanni Soda, Alessandro Vullo