Sciweavers

ECIR
2008
Springer
13 years 10 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron
IUI
2006
ACM
14 years 2 months ago
Automatically classifying emails into activities
Email-based activity management systems promise to give users better tools for managing increasing volumes of email, by organizing email according to a user’s activities. Curren...
Mark Dredze, Tessa A. Lau, Nicholas Kushmerick
CCGRID
2008
IEEE
14 years 3 months ago
Modeling "Just-in-Time" Communication in Distributed Real-Time Multimedia Applications
The research area of Multimedia Content Analysis (MMCA) considers all aspects of the automated extraction of new knowledge from large multimedia data streams and archives. In re...
R. Yang, Robert D. van der Mei, D. Roubos, Frank J...
SDM
2009
SIAM
14 years 5 months ago
Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases
As the amount of textual information grows explosively in various kinds of business systems, it becomes more and more desirable to analyze both structured data records and unstruc...
ChengXiang Zhai, Duo Zhang, Jiawei Han
WWW
2007
ACM
14 years 9 months ago
Web object retrieval
The primary function of current Web search engines is essentially relevance ranking at the document level. However, myriad structured information about real-world objects is embed...
Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen,...